Methods for increasing Cas9-mediated engineering efficiency

ABSTRACT

Methods for use with Type II CRISPR-Cas9 systems for increasing Cas9-mediated genome engineering efficiency are disclosed. The methods can be used to decrease the number of off-target nucleic acid double-stranded breaks and/or to enhance homology-directed repair of a cleaved target nucleic acid.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims the benefit under 35 U.S.C. § 119(e)(1) of U.S. Provisional Application Nos. 62/042,358, filed Aug. 27, 2014 and 62/047,495, filed Sep. 8, 2014, each of which applications is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present invention relates to Type II CRISPR-Cas9 systems for use in increasing Cas9-mediated genome engineering efficiency by either decreasing the number of off-target nucleic acid double-stranded breaks, and/or enhancing homology-directed repair of a cleaved target nucleic acid.

BACKGROUND OF THE INVENTION

Clustered regularly interspaced short palindromic repeats (CRISPR) and associated Cas9 proteins constitute the CRISPR-Cas9 system. This system provides adaptive immunity against foreign DNA in bacteria (Barrangou, R., et al., “CRISPR provides acquired resistance against viruses in prokaryotes,” Science (2007) 315:1709-1712; Makarova, K. S., et al., “Evolution and classification of the CRISPR-Cas systems,” Nat Rev Microbiol (2011) 9:467-477; Garneau, J. E., et al., “The CRISPR/Cas bacterial immune system cleaves bacteriophage and plasmid DNA,” Nature (2010) 468:67-71; Sapranauskas, R., et al., “The Streptococcus thermophilus CRISPR/Cas system provides immunity in Escherichia coli,” Nucleic Acids Res (2011) 39:9275-9282).

The RNA-guided Cas9 endonuclease specifically targets and cleaves DNA in a sequence-dependent manner (Gasiunas, G., et al., “Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria,” Proc Natl Acad Sci USA (2012) 109:E2579-E2586; Jinek, M., et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity,” Science (2012) 337:816-821; Sternberg, S. H., et al., “DNA interrogation by the CRISPR RNA-guided endonuclease Cas9,” Nature (2014) 507:62; Deltcheva, E., et al., “CRISPR RNA maturation by trans-encoded small RNA and host factor RNase III,” Nature (2011) 471:602-607), and has been widely used for programmable genome editing in a variety of organisms and model systems (Cong, L., et al., “Multiplex genome engineering using CRISPR/Cas systems,” Science (2013) 339:819-823; Jiang, W., et al., “RNA-guided editing of bacterial genomes using CRISPR-Cas systems,” Nat. Biotechnol. (2013) 31:233-239; Sander, J. D. & Joung, J. K., “CRISPR-Cas systems for editing, regulating and targeting genomes,” Nature Biotechnol. (2014) 32:347-355).

Jinek, M., et al., (“A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity,” Science (2012) 337:816-21) showed that in a subset of CRISPR-associated (Cas) systems, the mature crRNA that is base-paired to trans-activating crRNA (tracrRNA) forms a two-part RNA structure, also called “dual-guide,” that directs the CRISPR-associated protein Cas9 to introduce double-stranded breaks in target DNA. At sites complementary to the crRNA-guide (spacer) sequence, the Cas9 HNH nuclease domain cleaves the complementary strand and the Cas9 RuvC-like domain cleaves the non-complementary strand. Dual-crRNA/tracrRNA molecules were engineered into single-chain crRNA/tracrRNA molecules. These single-chain crRNA/tracrRNA directed target sequence-specific Cas9 double-strand DNA cleavage.

However, site-specific nucleases such as Cas9 can introduce double-stranded breaks in DNA in unintended and/or incorrect locations, termed “off-target effects.” Accordingly, methods to reduce or eliminate off-target DNA breaks are highly desirable.

Additionally, DNA double-stranded breaks can be repaired by, for example, non-homologous end joining (NHEJ) or homology-directed repair (HDR). Faithful repair by HDR is inefficient at site-directed breaks of the target nucleic acid because other cellular mechanisms may result in the incorporation of nucleic acids at the site of a double-stranded break or a single-stranded nick. There is therefore a clear need to develop novel strategies that mitigate or eliminate off-target genome editing events and increase the efficiency of inserting new material into the sites cut by site-directed nucleases such as Cas9.

SUMMARY

In one aspect, the disclosure provides for a method for reducing off-target nuclease cleavage comprising: contacting a first complex comprising a catalytically active Cas9 and a guide RNA with a target nucleic acid; contacting a second complex comprising a catalytically inactive Cas9 (dCas9) and a guide RNA with an off-target nucleic acid; and cleaving the target nucleic acid with the first complex, wherein the second complex prevents the first complex from cleaving the off-target nucleic acid. In some embodiments, the active Cas9 comprises at least 25% amino acid identity to the HNH and RuvC active site motifs of a Cas9 from Streptococcus pyogenes, such as at least 50%, 75%, 95%, 99% and complete amino acid identity, or any percentage between 25% and 100%, to a Cas9 from S. pyogenes.

In some embodiments, the active Cas9 comprises at least 25% amino acid identity to the HNH and RuvC active site motifs of a Cas9 from Streptococcus thermophilus, such as at least 50%, 75%, 95%, 99% and complete amino acid identity, or any percentage between 25% and 100%, to a Cas9 from S. thermophilus. In some embodiments, the active Cas9 comprises at least 25% amino acid identity to the HNH and RuvC active site motifs of a Cas9 from Staphylococcus aureus, such as at least 50%, 75%, 95%, 99% and complete amino acid identity, or any percentage between 25% and 100%, to a Cas9 from S. aureus. In some embodiments, the active Cas9 comprises at least 25% amino acid identity to the HNH and RuvC active site motifs of a Cas9 from Neisseria meningitidis, such as at least 50%, 75%, 95%, 99% and complete amino acid identity, or any percentage between 25% and 100%, to a Cas9 from N. meningitidis.
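By way of illustration only, the percent amino acid identity thresholds recited above can be evaluated computationally once two sequences have been aligned. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps and assuming identity is counted over all aligned columns; the function names and the toy sequences are hypothetical and are not part of the disclosure, and other conventions for computing identity exist.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical residues / aligned columns, where
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1  # identical residue (neither is a gap here)
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy, hypothetical motif fragments (not real Cas9 sequences).
print(percent_identity("ACDE", "ACDF"))                # 3 of 4 columns match
print(meets_identity_threshold("AC-E", "ACDE", 75.0))  # gap column counts as a mismatch
```

In practice the alignment itself would be produced by a dedicated alignment tool, and the reported identity depends on the alignment parameters and on how gapped columns are treated.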

In some embodiments, the catalytically inactive Cas9 comprises a mutation in one or both of its nuclease domains. In some embodiments, the dCas9 is at least 80% catalytically inactive compared to a wild-type Cas9.

In some embodiments, the first complex is capable of binding to the off-target nucleic acid. In some embodiments, the binding and/or cleavage of the first complex to the off-target nucleic acid is reduced by at least 30%. In some embodiments, the binding of the first complex to the off-target nucleic acid is reduced by at least 70%.

In some embodiments, the cleaving comprises introducing a double-stranded break. In some embodiments, the cleaving comprises introducing a single-stranded break. In some embodiments, the target nucleic acid is DNA. In some embodiments, the target nucleic acid is double-stranded DNA.

In another aspect, the disclosure provides for a composition comprising: two site-directed polypeptides to Cas9, wherein the two site-directed polypeptides comprise a mutation in one of their nuclease domains, wherein the two site-directed polypeptides are configured to bind and cleave the same strand of a double-stranded target nucleic acid.

In some embodiments, the two site-directed polypeptides comprise at least 10% amino acid identity to a nuclease domain of Cas9 from S. pyogenes.

In some embodiments, the mutation comprises a D10A mutation. In some embodiments, the mutation comprises an H840A mutation. In some embodiments, the target nucleic acid is DNA.

In some embodiments, the two site-directed polypeptides are bound to the sense strand of the double-stranded target nucleic acid. In some embodiments, the two site-directed polypeptides are bound to the anti-sense strand of the double-stranded target nucleic acid. In some embodiments, the composition further comprises a donor polynucleotide. In some embodiments, the donor polynucleotide is single-stranded. In some embodiments, the donor polynucleotide is double-stranded. In some embodiments, the donor polynucleotide is partially single-stranded and partially double-stranded.

In another embodiment, a method for reducing binding and/or cleavage of an off-target nucleic acid by a complex comprising a catalytically active Cas9 protein and a guide polynucleotide, is provided. The method comprises: (a) contacting a first complex with a selected target nucleic acid, wherein said first complex comprises: (i) a catalytically active Cas9 protein and (ii) a first guide polynucleotide, such as sgRNA, that comprises a spacer adapted to bind to said selected target nucleic acid; and (b) contacting a second complex with an off-target nucleic acid, wherein said second complex comprises (i) a catalytically inactive Cas9 protein (dCas9 protein) that does not cleave the off-target nucleic acid and (ii) a second guide polynucleotide, such as sgRNA, that comprises a spacer adapted to bind to said off-target nucleic acid, thereby reducing binding and/or cleavage by said first complex of said off-target nucleic acid.

In other embodiments, the catalytically active Cas9 protein comprises at least 75% amino acid identity to a Cas9 from S. pyogenes, with the proviso that the Cas9 protein retains catalytic activity. In certain embodiments, the catalytically active Cas9 protein comprises at least 95% amino acid identity to a Cas9 from S. pyogenes, with the proviso that the Cas9 protein retains catalytic activity. In additional embodiments of the method, the catalytically active Cas9 protein is a S. pyogenes Cas9 protein or an orthologous Cas9 protein.

In further embodiments, the dCas9 protein comprises at least one mutation in one or more endonuclease domains to render the dCas9 protein catalytically inactive. In some embodiments, the dCas9 protein comprises at least 75% amino acid identity to a Cas9 protein from S. pyogenes. In other embodiments, the dCas9 protein comprises at least 75% amino acid identity to a Cas9 protein from S. pyogenes. In additional embodiments, the dCas9 protein is a S. pyogenes Cas9 protein or an orthologous Cas9 protein with at least one mutation in one or more endonuclease domains to render the orthologous Cas9 protein catalytically inactive. In certain embodiments, the one or more mutations is in a RuvC-1 domain, such as a D10A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutation in an orthologous Cas9 protein. In other embodiments, the one or more mutations is in the HNH domain, such as a H840A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutation in an orthologous Cas9 protein. In additional embodiments, the one or more mutations comprises a D10A mutation and a H840A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutations in an orthologous Cas9 protein.

In additional embodiments, the selected target nucleic acid is DNA, such as double-stranded DNA.

In further embodiments, the selected target nucleic acid is cleaved to provide a cleavage site and the method further comprises modifying the target nucleic acid, such as by inserting at least a portion of the donor polynucleotide at the cleavage site. In other embodiments, the modifying comprises deleting one or more nucleotides at the cleavage site.

In additional embodiments, the method is performed in a cell, such as a eukaryotic cell, or in vitro.

In another embodiment, a method for modifying a target nucleic acid is provided comprising: contacting two complexes to the same strand of the target nucleic acid, wherein each of the two complexes comprises a site-directed polypeptide and a nucleic acid-targeting nucleic acid, wherein the two site-directed polypeptides comprise a mutation in one of their nuclease domains; and modifying the target nucleic acid. In some embodiments, the nucleic acid-targeting nucleic acid from one of the two complexes targets a different location in the target nucleic acid than the nucleic acid-targeting nucleic acid from the other of the two complexes.

In some embodiments, the two site-directed polypeptides comprise at least 75% amino acid identity to Cas9 from S. pyogenes. In some embodiments, the two site-directed polypeptides comprise at least 10% amino acid identity to a nuclease domain of Cas9 from S. pyogenes. In some embodiments, the mutation comprises a D10A mutation. In some embodiments, the mutation comprises an H840A mutation. In some embodiments, the target nucleic acid is DNA.

In some embodiments, the two site-directed polypeptides are bound to the sense strand of the double-stranded target nucleic acid. In some embodiments, the two site-directed polypeptides are bound to the anti-sense strand of the double-stranded target nucleic acid. In some embodiments, the modifying comprises cleaving the same strand of the target nucleic acid. In some embodiments, the cleaving comprises a single-stranded break. In some embodiments, the method further comprises inserting a donor polynucleotide into the target nucleic acid. In some embodiments, the donor polynucleotide is single-stranded. In some embodiments, the donor polynucleotide is double-stranded. In some embodiments, the donor polynucleotide is partially single-stranded and partially double-stranded.

In another embodiment, the invention is directed to a method for cleaving a single strand of a target nucleic acid comprising contacting first and second complexes at spaced-apart locations on the same strand of a nucleic acid molecule. The first complex comprises (i) a first Cas9 protein with a mutation in an endonuclease domain thereof to render the Cas9 protein a nickase; and (ii) a first guide polynucleotide, such as sgRNA, that comprises a spacer adapted to bind to a first target nucleic acid. The second complex comprises (i) a second Cas9 protein with a mutation in an endonuclease domain thereof, to render the Cas9 protein a nickase; and (ii) a second guide polynucleotide, such as sgRNA, that comprises a spacer adapted to bind to a second target nucleic acid; wherein the first and second Cas9 proteins cleave a single strand of said nucleic acid molecule at the spaced-apart locations on the same strand, to render a single-stranded break.

In some embodiments, the first Cas9 protein and/or the second Cas9 protein comprises at least 75% amino acid identity to a Cas9 from S. pyogenes. In certain embodiments, the Cas9 protein comprises at least 95% amino acid identity to a Cas9 from S. pyogenes. In additional embodiments of the method, the first Cas9 protein and/or the second Cas9 protein is a S. pyogenes Cas9 protein or an orthologous Cas9 protein with a mutation in an endonuclease domain thereof, to render the orthologous Cas9 protein a nickase. In certain embodiments, the one or more mutations is in a RuvC-1 domain, such as a D10A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutation in an orthologous Cas9 protein. In other embodiments, the one or more mutations is in the HNH domain, such as a H840A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutation in an orthologous Cas9 protein.

In further embodiments, the target nucleic acid is double-stranded DNA and the complexes bind to and cleave the anti-sense strand of the double-stranded DNA. In other embodiments, the target nucleic acid is double-stranded DNA and the complexes bind to and cleave the sense strand of the double-stranded DNA.

In additional embodiments, the method further comprises modifying the target nucleic acid, such as by inserting at least a portion of the donor polynucleotide into the target nucleic acid at the single-stranded break. In certain embodiments, the donor polynucleotide is single-stranded. In further embodiments, the inserting is done using homology-directed repair of the donor polynucleotide with the target nucleic acid.

In additional embodiments, the method is performed in a cell, such as a eukaryotic cell, or in vitro.

In yet further embodiments, a method for directed homology-directed repair (HDR) in a target nucleic acid is provided. The method comprises: (a) contacting a first complex with a first target nucleic acid, wherein said first complex comprises: (i) a catalytically active Cas9 protein and (ii) a first guide polynucleotide, such as a sgRNA, that comprises a spacer adapted to bind to said first target nucleic acid, wherein said first complex cleaves the first target nucleic acid; and (b) contacting a second complex with a second target nucleic acid, wherein said second complex comprises: (i) a first catalytically inactive Cas9 protein (dCas9 protein) that comprises at least one mutation in one or more endonuclease domains to render the dCas9 protein catalytically inactive such that the dCas9 protein does not cleave the second target nucleic acid, and (ii) a second guide polynucleotide, such as sgRNA, that comprises a spacer adapted to bind to said second target nucleic acid, wherein the second complex comprises one end of a polynucleotide donor associated therewith and configured in proximity to the cleaved first target nucleic acid; wherein at least a portion of the polynucleotide donor is inserted into the first target nucleic acid via HDR.

In certain embodiments, the second target nucleic acid is upstream of the first target nucleic acid. In other embodiments, the second target nucleic acid is downstream of the first target nucleic acid.

In certain embodiments of the above method, the 5′ end of the polynucleotide donor is associated with the second complex. In other embodiments, the 3′ end of the polynucleotide donor is associated with the second complex.

In additional embodiments, the method further comprises: (c) contacting a third complex with a third target nucleic acid, wherein the third target nucleic acid is positioned downstream of the first target nucleic acid when the first target nucleic acid is downstream of the second target nucleic acid, or wherein the third target nucleic acid is positioned upstream of the first target nucleic acid when the first target nucleic acid is upstream of the second target nucleic acid, wherein said third complex comprises: (i) a second dCas9 protein that comprises at least one mutation in one or more endonuclease domains to render the second dCas9 protein catalytically inactive such that the second dCas9 protein does not cleave the third target nucleic acid, and (ii) a third guide polynucleotide, such as sgRNA, that comprises a spacer adapted to bind to said third target nucleic acid, and wherein the third complex comprises the other end of the polynucleotide donor associated with the second complex. In certain embodiments, the 5′ end of the polynucleotide donor is associated with the second complex and the 3′ end of the polynucleotide donor is associated with the third complex. In other embodiments, the 3′ end of the polynucleotide donor is associated with the second complex and the 5′ end of the polynucleotide donor is associated with the third complex.
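The positional requirement on the third target nucleic acid described above amounts to placing the cleaved first target between the two dCas9-bound targets. This can be sketched as a simple coordinate check; the sketch assumes each target is represented by a single integer coordinate that increases in the downstream direction, and the function name and example coordinates are hypothetical and chosen only for illustration.

```python
def third_target_position_ok(first: int, second: int, third: int) -> bool:
    """Check the placement rule for the third target.

    Assumption: coordinates increase in the downstream direction.
    - If the first target is downstream of the second, the third must be
      downstream of the first.
    - If the first target is upstream of the second, the third must be
      upstream of the first.
    Either way, the first (cleaved) target lies between the other two.
    """
    if first > second:       # first is downstream of second
        return third > first
    if first < second:       # first is upstream of second
        return third < first
    return False             # coincident sites are not a valid layout here


# Hypothetical coordinates: cut site at 100, dCas9 sites on either side.
print(third_target_position_ok(first=100, second=50, third=150))
print(third_target_position_ok(first=100, second=50, third=20))
```

The first call describes a layout with the cut site flanked by the two catalytically inactive complexes; the second places both dCas9 sites on the same side of the cut and so fails the check.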

In other embodiments, the Cas9 protein comprises at least 75% amino acid identity to a Cas9 from S. pyogenes, with the proviso that the Cas9 protein retains catalytic activity. In certain embodiments, the Cas9 protein comprises at least 95% amino acid identity to a Cas9 from S. pyogenes, with the proviso that the Cas9 protein retains catalytic activity. In additional embodiments of the method, the Cas9 protein is a S. pyogenes Cas9 protein or an orthologous Cas9 protein.

In further embodiments, the dCas9 protein comprises at least 75% amino acid identity to a Cas9 protein from S. pyogenes. In other embodiments, the dCas9 protein comprises at least 75% amino acid identity to a Cas9 protein from S. pyogenes. In additional embodiments, the dCas9 protein is a S. pyogenes Cas9 protein or an orthologous Cas9 protein with at least one mutation in one or more endonuclease domains to render the orthologous Cas9 protein catalytically inactive. In certain embodiments, the one or more mutations is in a RuvC-1 domain, such as a D10A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutation in an orthologous Cas9 protein. In other embodiments, the one or more mutations is in the HNH domain, such as a H840A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutation in an orthologous Cas9 protein. In additional embodiments, the one or more mutations comprises a D10A mutation and a H840A mutation, numbered relative to S. pyogenes Cas9, or the corresponding mutations in an orthologous Cas9 protein.

In additional embodiments, the selected target nucleic acid is DNA, such as double-stranded DNA.

In further embodiments, the method is performed in a cell, such as a eukaryotic cell, or in vitro.

These aspects and other embodiments of the methods for increasing Cas9-mediated engineering efficiency and/or HDR repair will readily occur to those of ordinary skill in the art in view of the disclosure herein.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A and FIG. 1B present illustrative examples of Type II CRISPR-Cas9 associated RNAs. FIG. 1A shows a two-RNA component Type II CRISPR-Cas9 comprising a crRNA (FIG. 1A, 101) and a tracrRNA (FIG. 1A, 102), otherwise known as a dual-guide RNA. FIG. 1B illustrates the formation of base-pair hydrogen bonds between the crRNA and the tracrRNA to form secondary structure (see U.S. Published Patent Application No. 2014-0068797, published 6 Mar. 2014; see also Jinek M., et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity,” Science (2012) 337:816-821). The figures present an overview of and nomenclature for secondary structural elements of the crRNA and tracrRNA of the S. pyogenes Cas9 including the following: a spacer element (FIG. 1B, 103); a first stem element comprising a lower stem element (FIG. 1B, 104), a bulge element comprising unpaired nucleotides (FIG. 1B, 105), and an upper stem element (FIG. 1B, 106); a nexus element (FIG. 1B, 107); a second hairpin element comprising a second stem element (FIG. 1B, 108); and a third hairpin element comprising a third stem element (FIG. 1B, 109). The figures are not proportionally rendered nor are they to scale. The locations of indicators are approximate.

FIG. 2 shows another example of a Type II CRISPR-Cas9 associated RNA. The figure illustrates a single-guide RNA (sgRNA) wherein the crRNA is covalently joined to the tracrRNA and forms a RNA polynucleotide secondary structure through base-pair hydrogen bonding (see, e.g., U.S. Published Patent Application No. 2014-0068797, published 6 Mar. 2014). The figure presents an overview of and nomenclature for secondary structural elements of a sgRNA of the S. pyogenes Cas9 including the following: a spacer element (FIG. 2, 201); a first stem element comprising a lower stem element (FIG. 2, 202), a bulge element comprising unpaired nucleotides (FIG. 2, 205), and an upper stem element (FIG. 2, 203); a loop element (FIG. 2, 204) comprising unpaired nucleotides; (a first hairpin element comprises the first stem element and the loop element); a nexus element (FIG. 2, 206); a second hairpin element comprising a second stem element (FIG. 2, 207); and a third hairpin element comprising a third stem element (FIG. 2, 208). (See, e.g., FIGS. 1 and 3 of Briner, A. E., et al., “Guide RNA Functional Modules Direct Cas9 Activity and Orthogonality,” Molecular Cell (2014) 56:333-339.) The figure is not proportionally rendered nor is it to scale. The locations of indicators are approximate.

FIG. 3A and FIG. 3B relate to structural information for a sgRNA/Cas protein complex and a Cas protein, respectively. FIG. 3A provides a model based on the crystal structure of S. pyogenes Cas9 (SpyCas9) in an active complex with sgRNA (Anders C., et al., “Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease,” Nature (2014) 513:569-573). Structural studies of the SpyCas9 showed that the protein exhibits a bi-lobed architecture comprising the Catalytic nuclease lobe and the α-Helical lobe of the enzyme (See Jinek M., et al., “Structures of Cas9 endonucleases reveal RNA-mediated conformational activation,” Science (2014) 343:1247997; Anders C., et al., “Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease,” Nature (2014) 513:569-573). In FIG. 3A, the α-Helical lobe (FIG. 3A, Helical domain) is shown as the darker lobe; the Catalytic nuclease lobe (FIG. 3A, Catalytic nuclease lobe) is shown in a light grey and the sgRNA backbone is shown in black (FIG. 3A, sgRNA). The relative location of the 3′ end of the sgRNA is indicated (FIG. 3A, 3′ end sgRNA). The spacer RNA of the sgRNA is not visible because it is surrounded by the two protein lobes. The relative location of the 5′ end of the sgRNA (FIG. 3A, 5′ end sgRNA) is indicated and the spacer RNA of the sgRNA is located in the 5′ end region of the sgRNA. A cysteine residue (FIG. 3A, WT SpyCas9 Cys) in wild type SpyCas9 is identified in the present disclosure as an available cross-linking site. In FIG. 3A, the Catalytic nuclease lobe is shown as the lighter lobe wherein the relative positions of the RuvC (FIG. 3A, RuvC; RNase H homologous domain) and HNH nuclease (FIG. 3A, HNH; HNH nuclease homologous domain) domains are indicated. The RuvC and HNH nuclease domains, when active, each cut a different DNA strand in target DNA. The C-terminal domain (FIG. 3A, CTD) is involved in recognition of protospacer adjacent motifs (PAM) in target DNA. FIG. 3B presents a model of the domain arrangement of SpyCas9 relative to its primary sequence structure. In FIG. 3B, three regions of the primary sequence correspond to the RuvC domain (FIG. 3B, RuvC-I (amino acids 1-78), RuvC-II (amino acids 719-765), and RuvC-III (amino acids 926-1102)). One region corresponds to the Helical domain (FIG. 3B, Helical Domain (amino acids 79-718)). One region corresponds to the HNH domain (FIG. 3B, HNH (amino acids 766-925)). One region corresponds to the CTD domain (FIG. 3B, CTD (amino acids 1103-1368)). In FIG. 3B, the regions of the primary sequence corresponding to the α-Helical lobe (FIG. 3B, alpha-helical lobe) and the Nuclease domain lobe (FIG. 3B, Nuclease domain lobe) are indicated with brackets.
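As an illustration of the domain arrangement described for FIG. 3B, the residue ranges given above can be written as a lookup table and used to locate a residue position within the SpyCas9 primary sequence. This is a minimal sketch; the table and function names are hypothetical, and only the residue ranges are taken from the figure description.

```python
# Residue ranges (1-based, inclusive) as listed in the description of FIG. 3B.
SPYCAS9_DOMAINS = [
    ("RuvC-I", 1, 78),
    ("Helical Domain", 79, 718),
    ("RuvC-II", 719, 765),
    ("HNH", 766, 925),
    ("RuvC-III", 926, 1102),
    ("CTD", 1103, 1368),
]


def domain_of(residue: int) -> str:
    """Return the name of the domain that contains the given residue number."""
    for name, start, end in SPYCAS9_DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the range 1-1368")


# Positions of the two mutations discussed in this disclosure.
print(domain_of(10))   # position of the D10A mutation
print(domain_of(840))  # position of the H840A mutation
```

With these ranges, position 10 falls within RuvC-I and position 840 within HNH, which agrees with the description of D10A as a RuvC-domain mutation and H840A as an HNH-domain mutation.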

FIG. 4 depicts an exemplary embodiment of off-target binding and cleavage during genome engineering. In this embodiment, a target nucleic acid (FIG. 4, 115) is contacted with a complex comprising a site-directed polypeptide (e.g., Cas9) (FIG. 4, 105) and a nucleic acid-targeting nucleic acid (e.g., sgRNA or dual-guide RNA) (FIG. 4, 110). The complex comprising the Cas9 binds to a target nucleic acid (FIG. 4, 120). In some instances, the complex binds to an off-target nucleic acid (FIG. 4, 125). In a cleavage step (FIG. 4, 130), the Cas9 of the complex can cleave the target nucleic acid (FIG. 4, 120) and the off-target nucleic acid, thereby generating off-target effects.

FIG. 5 depicts an exemplary embodiment of a method of the disclosure for reducing off-target binding and cleavage events. A target nucleic acid (FIG. 5, 215) is contacted with a complex comprising a site-directed polypeptide (e.g., an active Cas9) (FIG. 5, 205) and a nucleic acid-targeting nucleic acid (e.g., sgRNA or dual-guide RNA) (FIG. 5, 210). The complex binds to a target nucleic acid (FIG. 5, 220). In some instances, the complex comprising the Cas9 and sgRNA binds to an off-target nucleic acid (FIG. 5, 225). Complexes comprising an engineered dCas9 protein (FIG. 5, 235) and an engineered sgRNA (FIG. 5, 236), can be introduced and contacted (FIG. 5, 230) with the target nucleic acid. The dCas9 complexes can either displace or prevent the binding of complexes comprising active Cas9. The active Cas9 can cleave (FIG. 5, 240/245) the target nucleic acid. The active Cas9 is prevented from cleaving the off-target nucleic acid because the dCas9 is preventing its binding and cleavage. In this way, off-target cleavage may be prevented.

FIG. 6A, FIG. 6B, and FIG. 6C show the use of tandem Cas9 D10A nickases to excise a single-stranded region of DNA on the same strand of a target nucleic acid and insert a donor polynucleotide. FIG. 6A shows two D10A sgRNA/dCas9 complexes targeted to two spaced-apart positions on the sense strand of a target polynucleotide. FIG. 6B shows that a region on the sense strand between the targeted sites has been cleaved. FIG. 6C shows the insertion of the donor polynucleotide with overlapping flanking regions.

FIG. 7A and FIG. 7B depict methods of increasing HDR using sgRNA/dCas9 and catalytically active sgRNA/Cas9 complexes. FIG. 7A shows a system using a single sgRNA/dCas9 complex tethered to a HDR polynucleotide donor adjacent to an active sgRNA/Cas9 complex to direct the donor to the site of the double-stranded break and to position the donor next to the cut site. FIG. 7B shows a system using two spaced-apart sgRNA/dCas9 complexes and a catalytically active sgRNA/Cas9 complex positioned between the two catalytically inactive complexes, wherein the donor is positioned across the double-stranded break.

FIG. 8 shows the effects of dCas9 nuclease blockers (dCas9-NBs) on VEGFA sgRNA/Cas9 on-target editing at the VEGFA locus.

FIG. 9 shows the effects of dCas9-NBs on VEGFA sgRNA/Cas9 off-target editing at a known VEGFA off-target locus on human chromosome 15.

FIG. 10 shows various embodiments of the experimental conditions used to position homology donor nucleotides near a targeted site for increasing HDR efficiency, as described in Example 5C.

FIG. 11 shows potential donor configurations using tandem Cas9D10A as described in the examples.

FIG. 12 shows a comparison of repair types using either Cas9 or Cas9D10A at Targets 3 and 4 (human CD34 locus) as described in the examples. Neg denotes a negative control which is either Cas9 or Cas9D10A only, without sgRNA. The distribution of repair is denoted by the bars in the figure. Solid bars=unedited; hatched bars=mutagenic repair; stippled bars=HDR.

DETAILED DESCRIPTION OF THE INVENTION

It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a sgRNA/dCas9 complex” includes one or more such complexes, reference to “a sgRNA/Cas9 complex” includes one or more such complexes, reference to “a mutation” includes one or more mutations, and the like. It is also to be understood that when reference is made to an embodiment using a sgRNA to target Cas9 or dCas9 to a target site, one skilled in the art can use an alternative embodiment of the invention based on the use of a dual-guide RNA (e.g., crRNA/tracrRNA) in place of the sgRNA.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although other methods and materials similar, or equivalent, to those described herein can be used in the practice of the present invention, preferred materials and methods are described herein.

In view of the teachings of the present specification, one of ordinary skill in the art can apply conventional techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant polynucleotides, as taught, for example, by the following standard texts: Antibodies: A Laboratory Manual, Second edition, E. A. Greenfield, 2014, Cold Spring Harbor Laboratory Press, ISBN 978-1-936113-81-1; Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications, 6th Edition, R. I. Freshney, 2010, Wiley-Blackwell, ISBN 978-0-470-52812-9; Transgenic Animal Technology, Third Edition: A Laboratory Handbook, 2014, C. A. Pinkert, Elsevier, ISBN 978-0124104907; The Laboratory Mouse, Second Edition, 2012, H. Hedrich, Academic Press, ISBN 978-0123820082; Manipulating the Mouse Embryo: A Laboratory Manual, 2013, R. Behringer, et al., Cold Spring Harbor Laboratory Press, ISBN 978-1936113019; PCR 2: A Practical Approach, 1995, M. J. McPherson, et al., IRL Press, ISBN 978-0199634248; Methods in Molecular Biology (Series), J. M. Walker, ISSN 1064-3745, Humana Press; RNA: A Laboratory Manual, 2010, D. C. Rio, et al., Cold Spring Harbor Laboratory Press, ISBN 978-0879698911; Methods in Enzymology (Series), Academic Press; Molecular Cloning: A Laboratory Manual (Fourth Edition), 2012, M. R. Green, et al., Cold Spring Harbor Laboratory Press, ISBN 978-1605500560; Bioconjugate Techniques, Third Edition, 2013, G. T. Hermanson, Academic Press, ISBN 978-0123822390; Methods in Plant Biochemistry and Molecular Biology, 1997, W. V. Dashek, CRC Press, ISBN 978-0849394805; Plant Cell Culture Protocols (Methods in Molecular Biology), 2012, V. M. Loyola-Vargas, et al., Humana Press, ISBN 978-1617798177; Plant Transformation Technologies, 2011, C. N. Stewart, et al., Wiley-Blackwell, ISBN 978-0813821955; Recombinant Proteins from Plants (Methods in Biotechnology), 2010, C. Cunningham, et al., Humana Press, ISBN 978-1617370212; Plant Genomics: Methods and Protocols (Methods in Molecular Biology), 2009, D. J. Somers, et al., Humana Press, ISBN 978-1588299970; Plant Biotechnology: Methods in Tissue Culture and Gene Transfer, 2008, R. Keshavachandran, et al., Orient Blackswan, ISBN 978-8173716164.

The term “Cas9 protein” as used herein refers to Type II CRISPR-Cas9 proteins (as described, e.g., in Chylinski, K., “The tracrRNA and Cas9 families of type II CRISPR-Cas immunity systems,” RNA Biol. (2013) 10(5):726-737), including, but not limited to, Cas9, Cas9-like proteins, proteins encoded by Cas9 orthologs, Cas9-like synthetic proteins, and variants and modifications thereof. The term thus encompasses Cas9 wild-type proteins derived from Type II CRISPR-Cas9 systems, modifications of Cas9 proteins, variants of Cas9 proteins, Cas9 orthologs, and combinations thereof. Cas9 proteins can be derived from any of various bacterial species whose genomes encode such proteins. Cas proteins for use in the present methods are described further below.

The terms “wild-type,” “naturally-occurring” and “unmodified” are used herein to mean the typical (or most common) form, appearance, phenotype, or strain existing in nature; for example, the typical form of cells, organisms, characteristics, polynucleotides, proteins, macromolecular complexes, genes, RNAs, DNAs, or genomes as they occur in and can be isolated from a source in nature. The wild-type form, appearance, phenotype, or strain serves as the original parent before an intentional modification. Thus, mutant, variant, engineered, recombinant, and modified forms are not wild-type forms.

As used herein, the terms “engineered,” “genetically engineered,” “recombinant,” “modified,” and “non-naturally occurring” are interchangeable and indicate intentional human manipulation.

As used herein, the terms “nucleic acid,” “nucleotide sequence,” “oligonucleotide,” and “polynucleotide” are interchangeable. All refer to a polymeric form of nucleotides. The nucleotides may be deoxyribonucleotides (DNA) or ribonucleotides (RNA), or analogs thereof, and they may be of any length. Polynucleotides may perform any function and may have any secondary structure and three-dimensional structure. The terms encompass known analogs of natural nucleotides and nucleotides that are modified in the base, sugar and/or phosphate moieties. Analogs of a particular nucleotide have the same base-pairing specificity (e.g., an analog of A base pairs with T). A polynucleotide may comprise one modified nucleotide or multiple modified nucleotides. Examples of modified nucleotides include methylated nucleotides and nucleotide analogs. Nucleotide structure may be modified before or after a polymer is assembled. Following polymerization, polynucleotides may be additionally modified via, for example, conjugation with a labeling component or target-binding component. A nucleotide sequence may incorporate non-nucleotide components. The terms also encompass nucleic acids comprising modified backbone residues or linkages, that (i) are synthetic, naturally occurring, and non-naturally occurring, and (ii) have similar binding properties as a reference polynucleotide (e.g., DNA or RNA). Examples of such analogs include, but are not limited to, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs), and morpholino structures.

Polynucleotide sequences are displayed herein in the conventional 5′ to 3′ orientation.

As used herein, the term “complementarity” refers to the ability of a nucleic acid sequence to form hydrogen bond(s) with another nucleic acid sequence (e.g., through traditional Watson-Crick base pairing). A percent complementarity indicates the percentage of residues in a nucleic acid molecule that can form hydrogen bonds with a second nucleic acid sequence. When two polynucleotide sequences have 100% complementarity, the two sequences are perfectly complementary, i.e., all of a first polynucleotide's contiguous residues hydrogen bond with the same number of contiguous residues in a second polynucleotide.
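
As an illustrative sketch only, and not part of the disclosed methods, percent complementarity between two strands could be computed as follows. The function name `percent_complementarity`, the restriction to equal-length DNA sequences without gaps, and the use of Watson-Crick pairs only are assumptions made for this example.

```python
# Watson-Crick partners for DNA bases (assumption: no wobble pairs, no gaps).
WC_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(first: str, second: str) -> float:
    """Percentage of residues in `first` that can pair with `second`.

    Both sequences are written 5' to 3'; `second` is read in reverse so the
    two strands are compared in antiparallel orientation.
    """
    first, second = first.upper(), second.upper()
    if len(first) != len(second) or not first:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1 for a, b in zip(first, reversed(second)) if WC_PAIRS.get(a) == b
    )
    return 100.0 * paired / len(first)
```

Under these assumptions, a sequence and its exact reverse complement score 100%, which corresponds to the perfectly complementary case described above.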

As used herein, the term “sequence identity” generally refers to the percent identity of bases or amino acids determined by comparing a first polynucleotide or polypeptide to a second polynucleotide or polypeptide using algorithms having various weighting parameters. Sequence identity between two polypeptides or two polynucleotides can be determined using sequence alignment by various methods and computer programs (e.g., BLAST, CS-BLAST, FASTA, HMMER, L-ALIGN, etc.), available through the worldwide web at sites including GENBANK (ncbi.nlm.nih.gov/genbank/) and EMBL-EBI (ebi.ac.uk). Sequence identity between two polynucleotides or two polypeptide sequences is generally calculated using the standard default parameters of the various methods or computer programs.

As used herein a “stem-loop structure” or “stem-loop element” refers to a polynucleotide having a secondary structure that includes a region of nucleotides that are known or predicted to form a double-stranded region (the “stem element”) that is linked on one side by a region of predominantly single-stranded nucleotides (the “loop element”). The term “hairpin” element is also used herein to refer to stem-loop structures. Such structures are well known in the art. The base pairing may be exact. However, as is known in the art, a stem element does not require exact base pairing. Thus, the stem element may include one or more base mismatches or non-paired bases.

As used herein, the term “recombination” refers to a process of exchange of genetic information between two polynucleotides.

As used herein, the term “homology-directed repair” or “HDR” refers to DNA repair that takes place in cells, for example, during repair of double-stranded and single-stranded breaks in DNA. HDR requires nucleotide sequence homology and uses a “donor template” (donor template DNA, polynucleotide donor, or oligonucleotide, used interchangeably herein) to repair the sequence where the double-stranded break occurred (e.g., DNA target sequence). This results in the transfer of genetic information from, for example, the donor template DNA to the DNA target sequence. HDR may result in alteration of the DNA target sequence (e.g., insertion, deletion, mutation) if the donor template DNA sequence or oligonucleotide sequence differs from the DNA target sequence and part or all of the donor template DNA polynucleotide or oligonucleotide is incorporated into the DNA target sequence. In some embodiments, an entire donor template DNA polynucleotide, a portion of the donor template DNA polynucleotide, or a copy of the donor polynucleotide is integrated at the site of the DNA target sequence.

As used herein the term “non-homologous end joining” or “NHEJ” refers to the repair of double-stranded breaks in DNA by direct ligation of one end of the break to the other end of the break without a requirement for a donor template DNA. NHEJ in the absence of a donor template DNA often results in a small number of nucleotides randomly inserted or deleted at the site of the double-stranded break.

Alternative mechanisms of DNA insertion that do not require sequence homology between the donor and the target sequence can also be used for nucleic acid insertion. These mechanisms involve various components of the cellular DNA repair machinery and it is to be understood that the scope of the invention is not bound by the use of any particular mechanism for insertion of nucleic acid after target nucleic acid is cut or nicked by a site-specific polynucleotide.

The terms “vector” and “plasmid” are used interchangeably and as used herein refer to a polynucleotide vehicle to introduce genetic material into a cell. Vectors can be linear or circular. Vectors can integrate into a target genome of a host cell or replicate independently in a host cell. Vectors can comprise, for example, an origin of replication, a multicloning site, and/or a selectable marker. An expression vector typically comprises an expression cassette. Vectors and plasmids include, but are not limited to, integrating vectors, prokaryotic plasmids, eukaryotic plasmids, plant synthetic chromosomes, episomes, viral vectors, cosmids, and artificial chromosomes. As used herein the term “expression cassette” is a polynucleotide construct, generated recombinantly or synthetically, comprising regulatory sequences operably linked to a selected polynucleotide to facilitate expression of the selected polynucleotide in a host cell. For example, the regulatory sequences can facilitate transcription of the selected polynucleotide in a host cell, or transcription and translation of the selected polynucleotide in a host cell. An expression cassette can, for example, be integrated in the genome of a host cell or be present in an expression vector.

As used herein a “targeting vector” is a recombinant DNA construct typically comprising tailored DNA arms homologous to genomic DNA that flanks critical elements of a target gene or target sequence. When introduced into a cell, the targeting vector integrates into the cell genome via homologous recombination. Elements of the target gene can be modified in a number of ways including deletions and/or insertions. A defective target gene can be replaced by a functional target gene, or in the alternative a functional gene can be knocked out. Optionally a targeting vector comprises a selection cassette comprising a selectable marker that is introduced into the target gene. Targeting regions adjacent or sometimes within a target gene can be used to affect regulation of gene expression.

As used herein, the terms “regulatory sequences,” “regulatory elements,” and “control elements” are interchangeable and refer to polynucleotide sequences that are upstream (5′ non-coding sequences), within, or downstream (3′ non-translated sequences) of a polynucleotide target to be expressed. Regulatory sequences influence, for example, the timing of transcription, amount or level of transcription, RNA processing or stability, and/or translation of the related structural nucleotide sequence. Regulatory sequences may include activator binding sequences, enhancers, introns, polyadenylation recognition sequences, promoters, repressor binding sequences, stem-loop structures, translational initiation sequences, translation leader sequences, transcription termination sequences, translation termination sequences, primer binding sites, and the like.

As used herein the term “operably linked” refers to polynucleotide sequences or amino acid sequences placed into a functional relationship with one another. For instance, a promoter or enhancer is operably linked to a coding sequence if it regulates, or contributes to the modulation of, the transcription of the coding sequence. Operably linked DNA sequences encoding regulatory sequences are typically contiguous to the coding sequence. However, enhancers can function when separated from a promoter by up to several kilobases or more. Accordingly, some polynucleotide elements may be operably linked but not contiguous.

As used herein, the term “expression” refers to transcription of a polynucleotide from a DNA template, resulting in, for example, an mRNA or other RNA transcript (e.g., non-coding, such as structural or scaffolding RNAs). The term further refers to the process through which transcribed mRNA is translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be referred to collectively as “gene product.” Expression may include splicing the mRNA in a eukaryotic cell, if the polynucleotide is derived from genomic DNA.

As used herein the term “modulate” refers to a change in the quantity, degree or amount of a function. For example, the methods disclosed herein may modulate Cas9-mediated targeting efficiency by decreasing or eliminating off-target cleavage, thereby enhancing cleavage at the target site, or may enhance HDR and decrease the likelihood of NHEJ events. Accordingly, the term “modulating targeting” may denote increasing desired targeting events and/or inhibiting off-target cleavage. Similarly, “modulating HDR” can denote increasing HDR and/or decreasing NHEJ.

Modulation can be assayed by determining any characteristic directly or indirectly affected by the expression of the target gene. Such characteristics include, e.g., changes in targeting efficiency, RNA or protein levels, protein activity, product levels, associated gene expression, or activity level of reporter genes. Thus, “modulation” of gene expression includes both gene activation and gene repression.

As used herein, the term “amino acid” refers to natural and synthetic (unnatural) amino acids, including amino acid analogs, modified amino acids, peptidomimetics, glycine, and D or L optical isomers.

As used herein, the terms “peptide,” “polypeptide,” and “protein” are interchangeable and refer to polymers of amino acids. A polypeptide may be of any length. It may be branched or linear, it may be interrupted by non-amino acids, and it may comprise modified amino acids. The terms may be used to refer to an amino acid polymer that has been modified through, for example, acetylation, disulfide bond formation, glycosylation, lipidation, phosphorylation, cross-linking, and/or conjugation (e.g., with a labeling component or ligand). Polypeptide sequences are displayed herein in the conventional N-terminal to C-terminal orientation.

Polypeptides and polynucleotides can be made using routine techniques in the field of molecular biology (see, e.g., standard texts discussed above). Further, essentially any polypeptide or polynucleotide can be custom ordered from commercial sources.

The term “binding” as used herein includes a non-covalent interaction between macromolecules (e.g., between a protein and a polynucleotide, between a polynucleotide and a polynucleotide, and between a protein and a protein). Such non-covalent interaction is also referred to as “associating” or “interacting” (e.g., when a first macromolecule interacts with a second macromolecule, the first macromolecule binds to the second macromolecule in a non-covalent manner). Some portions of a binding interaction may be sequence-specific; however, not all components of a binding interaction need to be sequence-specific, such as a protein's contacts with phosphate residues in a DNA backbone. Binding interactions can be characterized by a dissociation constant (Kd). “Affinity” refers to the strength of binding. An increased binding affinity is correlated with a lower Kd. An example of non-covalent binding is hydrogen bond formation between base pairs.
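
For the simplest case of reversible 1:1 binding between two macromolecules, the dissociation constant takes the standard textbook form shown below. The labels A, B, and AB are introduced here for illustration only and are not used elsewhere in the specification.

```latex
\[
  \mathrm{A} + \mathrm{B} \rightleftharpoons \mathrm{AB},
  \qquad
  K_d = \frac{[\mathrm{A}]\,[\mathrm{B}]}{[\mathrm{AB}]}
\]
```

At given free concentrations of A and B, a smaller Kd corresponds to a larger concentration of the complex AB, which is why a lower Kd is associated with increased binding affinity.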

As used herein, the term “isolated” can refer to a nucleic acid or polypeptide that, by the hand of a human, exists apart from its native environment and is therefore not a product of nature. Isolated means substantially pure. An isolated nucleic acid or polypeptide can exist in a purified form and/or can exist in a non-native environment such as, for example, in a recombinant cell.

As used herein, a “host cell” generally refers to a biological cell. A cell can be the basic structural, functional and/or biological unit of a living organism. A cell can originate from any organism having one or more cells. Examples of host cells include, but are not limited to: a prokaryotic cell, eukaryotic cell, a bacterial cell, an archaeal cell, a cell of a single-cell eukaryotic organism, a protozoa cell, a cell from a plant (e.g., cells from plant crops, fruits, vegetables, grains, soybean, corn, maize, wheat, seeds, tomatoes, rice, cassava, sugarcane, sunflower, sorghum, millet, alfalfa, oil-producing Brassica (for example, but not limited to, oilseed rape/canola), pumpkin, hay, potatoes, cotton, cannabis, tobacco, flowering plants, conifers, gymnosperms, ferns, clubmosses, hornworts, liverworts, mosses), an algal cell (e.g., Botryococcus braunii, Chlamydomonas reinhardtii, Nannochloropsis gaditana, Chlorella pyrenoidosa, Sargassum patens C. Agardh, and the like), seaweeds (e.g., kelp), a fungal cell (e.g., a yeast cell, a cell from a mushroom), an animal cell, a cell from an invertebrate animal (e.g., fruit fly, cnidarian, echinoderm, nematode, etc.), a cell from a vertebrate animal (e.g., fish, amphibian, reptile, bird, mammal), a cell from a mammal (e.g., a pig, a cow, a goat, a sheep, a rodent, a rat, a mouse, a non-human primate, a human, etc.). Further, a cell can be a stem cell or progenitor cell.

As used herein, the term “transgenic organism” refers to an organism comprising a recombinantly introduced polynucleotide.

As used herein, the terms “transgenic plant cell” and “transgenic plant” are interchangeable and refer to a plant cell or a plant containing a recombinantly introduced polynucleotide. Included in the term transgenic plant is the progeny (any generation) of a transgenic plant or a seed such that the progeny or seed comprises a DNA sequence encoding a recombinantly introduced polynucleotide or a fragment thereof.

As used herein, the phrase “generating a transgenic plant cell or a plant” refers to using recombinant DNA methods and techniques to construct a vector for plant transformation to transform the plant cell or the plant and to generate the transgenic plant cell or the transgenic plant.

A CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) is a genomic locus found in the genomes of many prokaryotes (e.g., bacteria and archaea). CRISPR loci provide resistance to foreign invaders (e.g., virus, phage) in prokaryotes. In this way, the CRISPR system can be thought to function as a type of immune system to help defend prokaryotes against foreign invaders. There are three stages of CRISPR locus function: integration of new sequences into the locus, biogenesis of CRISPR RNA (crRNA), and silencing of foreign invader nucleic acid.

A CRISPR locus includes a number of short repeating sequences referred to as “repeats.” Repeats can form hairpin structures and/or repeats can be unstructured single-stranded sequences. The repeats occur in clusters. Repeats frequently diverge between species. Repeats are regularly interspaced with unique intervening sequences, referred to as “spacers,” resulting in a repeat-spacer-repeat locus architecture. Spacers are identical to or have high homology with known foreign invader sequences. A spacer-repeat unit encodes a CRISPR RNA (crRNA). A crRNA refers to the mature form of the spacer-repeat unit. A crRNA comprises a “seed” sequence that is involved in targeting a target nucleic acid (e.g., possibly as a surveillance mechanism against foreign nucleic acid). A seed sequence is typically located towards the 5′ end of a crRNA (e.g., in the Cascade complex; for a description of the Cascade complex see, e.g., Jore, M. M. et al., “Structural basis for CRISPR RNA-guided DNA recognition by Cascade,” Nature Structural & Molecular Biology (2011) 18:529-536) or at the 3′ end of the spacer of a crRNA (e.g., in a Type II CRISPR-Cas9 system), directly adjacent to the first stem.
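
The repeat-spacer-repeat architecture can be illustrated with a minimal sketch that recovers spacers from an array sequence. It assumes that every repeat is an exact copy of a known repeat sequence and that no spacer contains the repeat; real loci often carry degenerate repeats, so this is a simplification. The function name `extract_spacers` is hypothetical.

```python
def extract_spacers(array_seq: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of `repeat`.

    Assumption: repeats are exact, identical copies of `repeat`.
    """
    if not repeat:
        raise ValueError("repeat must be non-empty")
    parts = array_seq.split(repeat)
    # parts[0] precedes the first repeat (leader side) and parts[-1] follows
    # the last repeat; only the interior pieces are flanked by two repeats.
    return [piece for piece in parts[1:-1] if piece]
```

For a locus of the form leader, repeat, spacer 1, repeat, spacer 2, repeat, trailer, the function returns spacer 1 and spacer 2 in locus order.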

A CRISPR locus comprises polynucleotide sequences encoding CRISPR-associated (Cas) genes. Cas genes are involved in the biogenesis and/or the interference stages of crRNA function. Cas genes display extreme sequence (e.g., primary sequence) divergence between species and homologues. For example, Cas1 homologues can comprise less than 10% primary sequence identity between homologues. Some Cas genes comprise homologous secondary and/or tertiary structures. For example, despite extreme sequence divergence, many members of the Cas6 family of CRISPR proteins comprise an N-terminal ferredoxin-like fold. Cas genes are named according to the organism from which they are derived. For example, Cas genes in Staphylococcus epidermidis can be referred to as Csm-type, Cas genes in Streptococcus thermophilus can be referred to as Csn-type, and Cas genes in Pyrococcus furiosus can be referred to as Cmr-type.

The integration stage of a CRISPR system refers to the ability of the CRISPR locus to integrate new spacers into the crRNA array upon being infected by a foreign invader. Acquisition of the foreign invader spacers can help confer immunity to subsequent attacks by the same foreign invader. Integration typically occurs at the leader end of the CRISPR locus. Cas proteins (e.g., Cas1 and Cas2) are involved in integration of new spacer sequences. Integration proceeds similarly for some types of CRISPR systems (e.g., Type I-III).

Mature crRNAs are processed from a longer polycistronic CRISPR locus transcript (i.e., pre-crRNA array). A pre-crRNA array comprises a plurality of crRNAs. The repeats in the pre-crRNA array are recognized by Cas proteins. Cas proteins bind to the repeats and cleave the repeats. This action can liberate the plurality of crRNAs. crRNAs can be subjected to further events to produce the mature crRNA form such as trimming (e.g., with an exonuclease). A crRNA may comprise all, some, or none of the CRISPR repeat sequence.

Interference refers to the stage in the CRISPR system that is functionally responsible for combating infection by a foreign invader. CRISPR interference follows a similar mechanism to RNA interference (RNAi; e.g., wherein a target RNA is targeted (e.g., hybridized) by a short interfering RNA (siRNA)), which results in target RNA degradation and/or destabilization. CRISPR systems perform interference of a target nucleic acid by coupling crRNAs and Cas proteins, thereby forming CRISPR ribonucleoproteins (crRNPs). The crRNA of the crRNP guides the crRNP to foreign invader nucleic acid (e.g., by recognizing the foreign invader nucleic acid through hybridization). Hybridized target foreign invader nucleic acid-crRNA units are subjected to cleavage by Cas proteins. Target nucleic acid interference typically requires a protospacer adjacent motif (PAM) in a target nucleic acid.

There are at least four types of CRISPR systems: Type I, Type II, Type III, and Type U. More than one CRISPR type system can be found in an organism. CRISPR systems can be complementary to each other, and/or can lend functional units in trans to facilitate CRISPR locus processing. Type II systems can be further subdivided into Type II-A (contains Csn2 locus), Type II-B (contains Cas4 locus), and Type II-C (neither Csn2 nor Cas4, e.g., N. meningitidis). Modifications of the components of CRISPR-Type II systems are extensively discussed in the present specification.

crRNA biogenesis in a Type II CRISPR system comprises a trans-activating CRISPR RNA (tracrRNA). A tracrRNA is typically modified by endogenous RNase III. The tracrRNA hybridizes to a crRNA repeat in the pre-crRNA array. Endogenous RNase III is recruited to cleave the pre-crRNA. Cleaved crRNAs are subjected to exoribonuclease trimming to produce the mature crRNA form (e.g., 5′ trimming). The tracrRNA typically remains hybridized to the crRNA. The tracrRNA and the crRNA associate with a site-directed polypeptide (e.g., Cas9). The crRNA of the crRNA-tracrRNA-Cas9 complex can guide the complex to a target nucleic acid to which the crRNA can hybridize. Hybridization of the crRNA to the target nucleic acid activates a wild-type, cognate Cas9 for target nucleic acid cleavage. Target nucleic acid in a Type II CRISPR system comprises a PAM. In some embodiments, a PAM is essential to facilitate binding of a site-directed polypeptide (e.g., Cas9) to a target nucleic acid.

Cas9 is an exemplary Type II CRISPR Cas protein. Cas9 is an endonuclease that can be programmed by the tracrRNA/crRNA to cleave, site-specifically, target DNA using two distinct endonuclease domains (HNH and RuvC/RNase H-like domains) (see U.S. Published Patent Application No. 2014-0068797, published 6 Mar. 2014; see also Jinek M., et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity,” Science (2012) 337:816-821), one for each strand of the DNA's double helix. RuvC and HNH together produce double-stranded breaks (DSBs), and separately can produce single-stranded breaks. FIG. 3A presents a model of the domain arrangement of SpyCas9 (S. pyogenes Cas9) relative to its primary sequence structure. Two RNA components of a Type II CRISPR-Cas9 system are illustrated in FIG. 1A. Typically each CRISPR-Cas9 system comprises a tracrRNA and a crRNA. However, this requirement can be bypassed by using an engineered sgRNA, described more fully below, containing a designed hairpin that mimics the tracrRNA-crRNA complex (Jinek et al., 2012). Base-pairing between the sgRNA and target DNA causes double-stranded breaks (DSBs) due to the endonuclease activity of Cas9. Binding specificity is determined by both sgRNA-DNA base pairing and a short DNA motif (protospacer adjacent motif [PAM] sequence: NGG) juxtaposed to the DNA complementary region (Marraffini L A, Sontheimer E J. “CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea,” Nat Rev Genet., 2010; 11:181-190). Thus, the CRISPR system only requires a minimal set of two molecules: the Cas9 protein and the sgRNA.
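
The dependence of targeting on an adjacent NGG motif can be illustrated with a short sketch that lists candidate sites on one strand of a DNA sequence. The function name `find_ngg_targets`, the default 20-nt target length, and the restriction to the given strand are assumptions made for this example; the sketch does not model mismatch tolerance or any other determinant of Cas9 activity.

```python
import re


def find_ngg_targets(dna: str, spacer_len: int = 20) -> list[tuple[int, str, str]]:
    """List (start, target, pam) for every site followed by an NGG PAM.

    Only the strand given in `dna` (5' to 3') is scanned; the reverse
    strand would need a separate pass over the reverse complement.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

Each returned tuple gives the zero-based start of the candidate target region, the target sequence itself, and the three-nucleotide PAM immediately 3′ of it.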

A large number of Cas9 orthologs are known in the art as well as their associated tracrRNA and crRNA components (see, e.g., “Supplementary Table S2. List of bacterial strains with identified Cas9 orthologs,” Fonfara, Ines, et al., “Phylogeny of Cas9 Determines Functional Exchangeability of Dual-RNA and Cas9 among Orthologous Type II CRISPR/Cas Systems,” Nucleic Acids Research (2014) 42:2577-2590, including all Supplemental Data; Chylinski K., et al., “Classification and evolution of type II CRISPR-Cas systems,” Nucleic Acids Research (2014) 42:6091-6105, including all Supplemental Data; Esvelt, K. M., et al., “Orthogonal Cas9 proteins for RNA-guided gene regulation and editing,” Nature Methods (2013) 10:1116-1121). A number of orthogonal Cas9 proteins have been identified including Cas9 proteins from Neisseria meningitidis, Streptococcus thermophilus and Staphylococcus aureus.

As used herein, “a Cas9 protein” refers to a Cas9 protein derived from any species, subspecies or strain of bacteria that encodes Cas9, as well as variants and orthologs of the particular Cas9 in question. The Cas9 proteins can either be directly isolated and purified from bacteria, or synthetically or recombinantly produced, or typically delivered using a construct encoding the protein, including without limitation, naked DNA, plasmid DNA, a viral vector and mRNA for Cas9 expression.

Variants and modifications of Cas9 proteins are known in the art. U.S. Published Patent Application 20140273226, published Sep. 18, 2014, incorporated herein by reference in its entirety, discusses the S. pyogenes Cas9 gene, Cas9 protein, and variants of the Cas9 protein including host-specific codon optimized Cas9 coding sequences (e.g., ¶¶0129-0137 therein) and Cas9 fusion proteins (e.g., ¶¶233-240 therein). U.S. Published Patent Application 20140315985, published Oct. 23, 2014, incorporated herein by reference in its entirety, teaches a large number of exemplary wild-type Cas9 polypeptides (e.g., SEQ ID NOS: 1-256, SEQ ID NOS: 795-1346, therein) including the sequence of Cas9 from S. pyogenes (SEQ ID NO: 8, therein). Modifications and variants of Cas9 proteins are also discussed (e.g., ¶¶504-608, therein). Non-limiting examples of Cas9 proteins include Cas9 proteins from S. pyogenes (GI:15675041); Listeria innocua Clip 11262 (GI:16801805); Streptococcus mutans UA159 (GI:24379809); Streptococcus thermophilus LMD-9 (S. thermophilus A, GI:11662823; S. thermophilus B, GI:116627542); Lactobacillus buchneri NRRL B-30929 (GI:331702228); Treponema denticola ATCC 35405 (GI:42525843); Francisella novicida U112 (GI:118497352); Campylobacter jejuni subsp. jejuni NCTC 11168 (GI:218563121); Pasteurella multocida subsp. multocida str. Pm70 (GI:218767588); Neisseria meningitidis Zs491 (GI:15602992) and Actinomyces naeslundii (GI:489880078).

Aspects of the present invention can be practiced by one of ordinary skill in the art following the guidance of the specification to use Type II CRISPR-Cas9 proteins and Cas-protein encoding polynucleotides, including, but not limited to Cas9, Cas9-like, proteins encoded by Cas9 orthologs, Cas9-like synthetic proteins, and variants and modifications thereof. The cognate RNA components of these Cas proteins can be manipulated and modified for use in the practice of the present invention by one of ordinary skill in the art following the guidance of the present specification.

By “dCas9” is meant a nuclease-deactivated Cas9, also termed “catalytically inactive,” “catalytically dead Cas9” or “dead Cas9.” Such molecules lack all or a portion of endonuclease activity and can therefore be used to regulate genes in an RNA-guided manner (Jinek M., et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity,” Science (2012) 337:816-821). This is accomplished by introducing mutations that inactivate Cas9 nuclease function and is typically accomplished by mutating both of the two catalytic residues (D10A in the RuvC-1 domain, and H840A in the HNH domain, numbered relative to S. pyogenes Cas9) of the gene encoding Cas9. It is understood that mutation of other catalytic residues to reduce activity of either or both of the nuclease domains can also be carried out by one skilled in the art. In doing so, dCas9 is unable to cleave dsDNA but retains the ability to target DNA. In the Cas9 double mutant carrying both the D10A and H840A substitutions, the nuclease and nickase activities are both completely inactivated. Targeting specificity is determined by complementary base-pairing of an sgRNA to the genomic locus and the protospacer adjacent motif (PAM).

dCas9 can be used alone or in fusions to synthetically repress (CRISPRi) or activate (CRISPRa) gene expression. CRISPRi can work independently of host cellular machineries. In some embodiments, only a dCas9 protein and a customized sgRNA designed with a complementary region to any gene of interest direct dCas9 to a chosen genomic location. In other embodiments, dCas9 can be fused to a transcription factor, such as a repressor, and the fused dCas9-transcription factor can then work in concert with cellular machineries. The binding specificity is determined jointly by the complementary region on the sgRNA and a short DNA motif (protospacer adjacent motif or PAM) juxtaposed to the DNA complementary region, dependent on the species in question (see, e.g., Anders C., et al., “Structural basis of PAM-dependent target DNA recognition by the Cas9 endonuclease,” Nature (2014) 513:569-573). In the case of S. pyogenes, this sequence is NGG. To achieve transcriptional repression, dCas9 can be used by itself (whereby it represses transcription through steric hindrance). Taken together, sgRNA and dCas9 provide a minimum system for gene-specific regulation in any organism (Qi, L. S., et al., “Repurposing CRISPR as an RNA-Guided Platform for Sequence-Specific Control of Gene Expression,” Cell (2013) 152:1173-1183). CRISPRa is carried out by dCas9-transcription factor (activator) fusions.

By a “Cas9 nickase” is meant a Cas9 mutant that does not retain the ability to make double-stranded breaks in a target nucleic acid sequence, but maintains the ability to bind to and make a single-stranded break at a target site. Such a mutant will typically include a mutation in one, but not both of the Cas9 endonuclease domains (HNH and RuvC). Thus, an amino acid mutation at position D10A or H840A in Cas9, numbered relative to S. pyogenes, can result in the inactivation of the nuclease catalytic activity and convert Cas9 to a nickase enzyme that makes single-stranded breaks at the target site. It is to be understood that other site-directed polypeptides such as meganucleases, TALE nucleases, Zinc-finger nucleases, MEGA-TALs and others known to one of skill in the art can be used in alternative embodiments.
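
The relationship between the two catalytic-residue mutations and the resulting activity (wild-type nuclease, nickase, or dCas9) can be summarized in a small sketch. It considers only the D10A and H840A substitutions named in the text, numbered relative to S. pyogenes Cas9; other inactivating mutations are deliberately ignored, and the function name `classify_cas9` is hypothetical.

```python
def classify_cas9(mutations: set[str]) -> str:
    """Classify an S. pyogenes Cas9 variant by its D10A/H840A status.

    Assumption: only D10A (RuvC-1 domain) and H840A (HNH domain) are
    considered; any other entries in `mutations` are ignored.
    """
    ruvc_inactive = "D10A" in mutations
    hnh_inactive = "H840A" in mutations
    if ruvc_inactive and hnh_inactive:
        return "dCas9"      # neither strand is cut; DNA binding is retained
    if ruvc_inactive or hnh_inactive:
        return "nickase"    # one domain still active: single-stranded break
    return "nuclease"       # both domains active: double-stranded break
```

For example, `classify_cas9({"D10A"})` and `classify_cas9({"H840A"})` both yield a nickase, while the double mutant is classified as dCas9.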

crRNA has a region of complementarity to a potential DNA target sequence (FIG. 1A, the dark, 5′ region of the crRNA) and a second region that forms base-pair hydrogen bonds with the tracrRNA to form a secondary structure, typically to form at least a stem structure (FIG. 1A, the light region extending to the 3′ end of the crRNA). The region of complementarity to the DNA target is the spacer. The tracrRNA and a crRNA interact through a number of base-pair hydrogen bonds to form secondary RNA structures, for example, as illustrated in FIG. 1B. Complex formation between tracrRNA/crRNA and Cas protein results in conformational change of the Cas protein that facilitates binding to DNA, endonuclease activities of the Cas protein, and crRNA-guided site-specific DNA cleavage by the endonuclease. For a Cas protein/tracrRNA/crRNA complex to cleave a DNA target sequence, the DNA target sequence is adjacent to a cognate protospacer adjacent motif (PAM).

The term “sgRNA” typically refers to a single-guide RNA (i.e., a single, contiguous polynucleotide sequence) that essentially comprises a crRNA connected at its 3′ end to the 5′ end of a tracrRNA through a “loop” sequence (see, e.g., U.S. Published Patent Application No. 20140068797, published 6 Mar. 2014, incorporated herein by reference in its entirety). sgRNA interacts with a cognate Cas protein essentially as described for tracrRNA/crRNA polynucleotides, as discussed above. Similar to crRNA, sgRNA has a spacer, a region of complementarity to a potential DNA target sequence (FIG. 2, 201), adjacent a second region that forms base-pair hydrogen bonds that form a secondary structure, typically a stem structure (FIG. 2, 202, 203, 204, 205). The term includes truncated single-guide RNAs (tru-sgRNAs) having a spacer of approximately 17-18 nt (see, e.g., Fu, Y. et al., “Improving CRISPR-Cas nuclease specificity using truncated guide RNAs,” Nat Biotechnol. (2014) 32:279-284). The term also encompasses functional miniature sgRNAs with expendable features removed, but that retain an essential and conserved module termed the “nexus” located in the portion of sgRNA that corresponds to tracrRNA (not crRNA). See, e.g., U.S. Published Patent Application No. 20140315985, published 23 Oct. 2014, incorporated herein by reference in its entirety; Briner et al., “Guide RNA Functional Modules Direct Cas9 Activity and Orthogonality,” Molecular Cell (2014) 56:333-339. The nexus is located immediately downstream of (i.e., located in the 3′ direction from) the lower stem in Type II CRISPR-Cas9 systems. An example of the relative location of the nexus is illustrated in the sgRNA shown in FIG. 2. The nexus confers the binding of a sgRNA or a tracrRNA to its cognate Cas9 protein and confers an apoenzyme to holoenzyme conformational transition.

With reference to a crRNA or sgRNA, a “spacer” or “spacer element” as used herein refers to the polynucleotide sequence that can specifically hybridize to a target nucleic acid sequence. The spacer element interacts with the target nucleic acid sequence through hydrogen bonding between complementary base pairs (i.e., paired bases). A spacer element binds to a selected DNA target sequence. Accordingly, the spacer element is a DNA target-binding sequence. The spacer element determines the location of Cas protein's site-specific binding and endonucleolytic cleavage. Spacer elements range from ~17 to ~84 nucleotides in length, depending on the Cas protein with which they are associated, and have an average length of 36 nucleotides (Marraffini, et al., “CRISPR interference: RNA-directed adaptive immunity in bacteria and archaea,” Nature Reviews Genetics (2010) 11:181-190). In a Type II CRISPR-Cas9 system the spacer element typically comprises a “seed” sequence that is involved in targeting a target nucleic acid. For example, for SpyCas9, the functional length for a spacer to direct specific cleavage is typically about 12-25 nucleotides. Variability of the functional length for a spacer element is known in the art (e.g., U.S. Published Patent Application No. 20140315985, published 23 Oct. 2014, incorporated herein by reference in its entirety).

FIG. 3A provides a three-dimensional model based on the crystal structure of S. pyogenes Cas9 (SpyCas9) in an active complex with sgRNA. The relationship of the sgRNA to the Helical domain and the Catalytic domain is illustrated. The 3′ and 5′ ends of the sgRNA are indicated, as well as exposed portions of the sgRNA. The spacer RNA of the sgRNA is not visible because it is surrounded by the α-Helical lobe (Helical domain) and the Catalytic nuclease lobe (Catalytic domain). The spacer RNA of the sgRNA is located in the 5′ end region of the sgRNA. The RuvC and HNH nuclease domains, when active, each cut a different DNA strand in target DNA. The C-terminal domain (CTD) is involved in recognition of protospacer adjacent motifs (PAMs) in target DNA.

U.S. Published Patent Application No. 20140315985, published 23 Oct. 2014, incorporated herein by reference in its entirety; and Briner et al., “Guide RNA Functional Modules Direct Cas9 Activity and Orthogonality,” Molecular Cell (2014) 56:333-339, disclose consensus sequences and secondary structures of predicted sgRNAs for several sgRNA/Cas9 families. The general arrangement of secondary structures in the predicted sgRNAs up to and including the nexus is presented in FIG. 2 herein, which presents an overview of and nomenclature for elements of the sgRNA of the S. pyogenes Cas9. Relative to FIG. 2, there is variation in the number and arrangement of stem structures located 3′ of the nexus in the sgRNAs of U.S. Published Patent Application No. 20140315985 and Briner, et al. Ran et al. (“In vivo genome editing using Staphylococcus aureus Cas9,” Nature (2015) 520:186-191, including all extended data) present the crRNA/tracrRNA sequences and secondary structures of eight Type II CRISPR-Cas9 systems (see Extended Data FIG. 1 of Ran, et al.). Further, Fonfara, et al. (“Phylogeny of Cas9 Determines Functional Exchangeability of Dual-RNA and Cas9 among Orthologous Type II CRISPR/Cas Systems,” Nucleic Acids Research (2014) 42:2577-2590, including all Supplemental Data, in particular Supplemental Figure S11) present the crRNA/tracrRNA sequences and secondary structures of eight Type II CRISPR-Cas9 systems.

By “guide polynucleotide” is meant any polynucleotide that site-specifically guides Cas9 or dCas9 to a target, or off-target, nucleic acid. Many such guide polynucleotides are known, including but not limited to sgRNA (including miniature and truncated sgRNAs), dual-guide RNA such as the crRNA/tracrRNA molecules described above, and the like.

By “donor polynucleotide” is meant a polynucleotide that can be directed to, and inserted into, a target site of interest to modify the target nucleic acid. All or a portion of the donor polynucleotide can be inserted into the target nucleic acid. The donor polynucleotide is used for repair of the break in the target DNA sequence resulting in the transfer of genetic information (i.e., polynucleotide sequences) from the donor at the site or in close proximity of the break in the DNA. Accordingly, new genetic information (i.e., polynucleotide sequences) may be inserted or copied at a target DNA site. The donor polynucleotide can be double- or single-stranded DNA, RNA, a vector, plasmid, or the like. Non-symmetrical polynucleotide donors can also be used that are composed of two DNA oligonucleotides. They are partially complementary, and each can include a flanking region of homology. The donor can be used to insert or replace polynucleotide sequences in a target sequence, for example, to introduce a polynucleotide that encodes a protein or functional RNA (e.g., siRNA), to introduce a protein tag, to modify a regulatory sequence of a gene, or to introduce a regulatory sequence to a gene (e.g., a promoter, an enhancer, an internal ribosome entry sequence, a start codon, a stop codon, a localization signal, or polyadenylation signal), to modify a nucleic acid sequence (e.g., introduce a mutation), and the like.

Targeted DNA modifications using donor polynucleotides for large changes (e.g., more than 100 bp insertions or deletions) traditionally use plasmid-based donor templates that contain homology arms flanking the site of alteration. Each arm can vary in length, but is typically longer than about 100 bp, such as 100-1500 bp, e.g., 100 . . . 200 . . . 300 . . . 400 . . . 500 . . . 600 . . . 700 . . . 800 . . . 900 . . . 1000 . . . 1500 bp or any integer between these values. However, these numbers can vary, depending on the size of the donor polynucleotide and the target polynucleotide. This method can be used to generate large modifications, including insertion of reporter genes such as fluorescent proteins or antibiotic resistance markers. For transfection in cells, such as HEK cells, approximately 100-1000 ng, e.g., 100 . . . 200 . . . 300 . . . 400 . . . 500 . . . 600 . . . 700 . . . 800 . . . 900 . . . 1000 ng or any integer between these values, of a typical size donor plasmid (e.g., approximately 5 kb) containing a sgRNA/Cas9 vector, can be used for one well in a 24-well plate. (See, e.g., Yang et al., “One Step Generation of Mice Carrying Reporter and Conditional Alleles by CRISPR/Cas-Mediated Genome Engineering,” Cell (2013) 154:1370-1379).

Single-stranded and partially double-stranded oligonucleotides, such as DNA oligonucleotides, have been used in place of targeting plasmids for short modifications (e.g., less than 50 bp) within a defined locus without cloning. To achieve high HDR efficiencies, single-stranded oligonucleotides containing flanking sequences on each side that are homologous to the target region can be used, and can be oriented in either the sense or antisense direction relative to the target locus. The length of each arm can vary, but the length of at least one arm is typically longer than about 10 bases, such as from 10-150 bases, e.g., 10 . . . 20 . . . 30 . . . 40 . . . 50 . . . 60 . . . 70 . . . 80 . . . 90 . . . 100 . . . 110 . . . 120 . . . 130 . . . 140 . . . 150, or any integer within these ranges. However, these numbers can vary, depending on the size of the donor polynucleotide and the target polynucleotide. In a preferred embodiment, the length of at least one arm is 10 bases or more. In other embodiments, the length of at least one arm is 20 bases or more. In yet other embodiments, the length of at least one arm is 30 bases or more. In some embodiments, the length of at least one arm is less than 100 bases. In further embodiments, the length of at least one arm is greater than 100 bases. In some embodiments, the length of at least one arm is zero bases. For single-stranded DNA oligonucleotide design, typically an oligonucleotide with around 100-150 bp total homology is used. The mutation is introduced in the middle, giving 50-75 bp homology arms for a donor designed to be symmetrical about the target site. In other cases, no homology arms are required, and the donor polynucleotide is inserted using non-homologous DNA repair mechanisms.
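By way of illustration only, the symmetrical single-stranded donor design described above can be sketched in Python. The function names (reverse_complement, design_ssodn), the 0-based half-open coordinates, and the default arm length of 60 bases are assumptions made for this sketch and are not part of the disclosed methods.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def design_ssodn(reference: str, edit_start: int, edit_end: int,
                 new_sequence: str, arm_length: int = 60,
                 antisense: bool = False) -> str:
    """Build a donor oligonucleotide with equal-length homology arms.

    reference    -- sequence of the target locus (hypothetical input)
    edit_start   -- 0-based start of the bases to replace
    edit_end     -- 0-based end (exclusive) of the bases to replace
    new_sequence -- bases to place between the two arms
    arm_length   -- length of each homology arm (assumed default: 60)
    antisense    -- if True, return the reverse complement of the donor
    """
    if not 0 <= edit_start <= edit_end <= len(reference):
        raise ValueError("edit coordinates fall outside the reference")
    if edit_start < arm_length or len(reference) - edit_end < arm_length:
        raise ValueError("reference is too short for the requested arms")
    left_arm = reference[edit_start - arm_length:edit_start]
    right_arm = reference[edit_end:edit_end + arm_length]
    donor = left_arm + new_sequence + right_arm
    return reverse_complement(donor) if antisense else donor
```

With an arm length of 50-75 bases and a short edit, the returned oligonucleotide carries roughly 100-150 bases of total homology, in line with the typical design noted above. Setting antisense to True yields the same donor oriented in the antisense direction relative to the target locus.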

In one embodiment, the methods described herein are useful for increasing Cas9-mediated engineering efficiency by modulating off-target genome editing events, e.g., by decreasing the number of double-stranded breaks in DNA in unintended and/or incorrect locations. In particular, genome engineering systems, such as those using zinc-finger nucleases (ZFNs), TALE-nucleases, and bacterially derived RNA-guided nucleases (e.g., the CRISPR-Cas9 system), have been used to target a protein to a specific genomic locus where it can induce a DNA double-stranded break. DNA double-stranded breaks can be repaired through either non-homologous end joining (NHEJ) or homology-directed repair (HDR). NHEJ can result in imperfect repair and the addition or deletion of several bases, whereas HDR can be utilized to insert rationally designed exogenous DNA sequences. These methods can sometimes result in off-target nuclease activity as described above.

Methods for increasing specificity and/or reducing off-target genomic events have included the use of shorter guide sequences with enhanced specificity (Fu, Y. et al., “Improving CRISPR-Cas nuclease specificity using truncated guide RNAs,” Nat Biotechnol. (2014) 32:279-284) and/or engineering Cas9 mutants that can use two independent targeting events to induce a double-stranded break (Ran, F. A., et al., “Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity,” Cell (2013) 154:1380-1389; Tsai, S. Q., et al., “Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing,” Nature Biotech. (2014) 32:569-576). However, these strategies may reduce the efficiency of on-target genome editing, constrain targeting capabilities, or still result in “off-target” nuclease activity.

Accordingly, an embodiment of the present invention provides methods to mitigate off-target genome editing events in a cell population or in an in vitro biochemical reaction. Mitigation of such events can be performed by an engineered CRISPR-Cas9 system as described herein. The methods include at least two basic components: (1) a complex of a catalytically active Cas9 protein and a sgRNA that targets the intended nucleic acid target (sgRNA/Cas9 complex); and (2) a complex of a catalytically inactive Cas9 protein, termed “dCas9” herein, and a sgRNA that targets off-target loci (sgRNA/dCas9 complex). In some embodiments, rather than a sgRNA/Cas9 complex, the first component can be any site-directed catalytically active DNA endonuclease, such as but not limited to zinc-finger nucleases (ZFNs), TALE-nucleases, and the like.

An off-target nucleic acid can differ from a target nucleic acid by, e.g., at least 1-5, such as 1, 2, 3, 4, 5 nucleotides, or up to 10 or more nucleotides or any number of nucleotides within the stated ranges.

The percent complementarity between an off-target nucleic acid locus (or surrounding genomic region) and an “on-target” nucleic acid-targeting nucleic acid can be, for example, about 5% to about 100%, or any percentage within this range, more preferably in the range of 90-100%.
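As a minimal sketch, the number of differing nucleotides and the percent complementarity discussed above can be computed as follows. For simplicity, this sketch assumes that the spacer is written as DNA in the same sense as the candidate site and that both sequences have equal length; the function names are hypothetical.

```python
def count_mismatches(spacer: str, site: str) -> int:
    """Count positions at which the spacer and a candidate site differ.

    Assumption: both sequences are written 5'->3' in the same sense
    (protospacer sense) and have equal length. RNA 'U' is read as 'T'.
    """
    spacer = spacer.upper().replace("U", "T")
    site = site.upper().replace("U", "T")
    if len(spacer) != len(site):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(spacer, site) if a != b)


def percent_complementarity(spacer: str, site: str) -> float:
    """Percentage of spacer positions that can base-pair with the site."""
    if not spacer:
        raise ValueError("spacer must not be empty")
    mismatches = count_mismatches(spacer, site)
    return 100.0 * (len(spacer) - mismatches) / len(spacer)
```

Under these assumptions, a 20-nucleotide spacer that differs from a candidate site at two positions has 90% complementarity, which falls at the lower end of the preferred 90-100% range.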

A number of catalytically active Cas9 proteins are known in the art and, as explained above, a Cas9 protein for use herein can be derived from any bacterial species, subspecies or strain that encodes the same. Although the subject invention is exemplified using S. pyogenes Cas9, orthologs from other bacterial species will find use herein. The specificity of these Cas9 orthologs is well known. Also useful are Cas9-like synthetic proteins, and variants and modifications thereof. As explained above, the sequences for hundreds of Cas9 proteins are known and any of these proteins will find use with the present methods. The appropriate Cas9 protein to use with a particular target nucleic acid can be readily determined by one of skill in the art.

dCas9 proteins are also known and, as described above, these proteins can be made catalytically inactive by mutating the RuvC1 and/or HNH domains to eliminate nuclease function. This is typically accomplished by introducing point mutations at both catalytic residues (D10A and H840A, numbered relative to S. pyogenes Cas9) in the gene encoding Cas9. In doing so, dCas9 is rendered unable to cleave double-stranded DNA but retains the ability to target DNA. Moreover, as with the Cas9 proteins, the dCas9 proteins can be derived from any bacterial species, subspecies or strain that encodes the same. Also useful are Cas9 orthologs, Cas9-like synthetic proteins, and variants and modifications thereof. In one embodiment, dCas9 orthologs are selected based on the particular protospacer adjacent motif (PAM) sequences present on the target nucleic acid. For example, S. pyogenes Cas9 targets NGG sequences. However, if other PAM sequences are present, dCas9 orthologs can be used to target these sequences to block Cas9 cleavage thereof and prevent off-target breaks.

In the following embodiments, sgRNA is used as an exemplary guide polynucleotide; however, it will be recognized by one of skill in the art that other guide polynucleotides that site-specifically guide Cas9 or dCas9 to a target, or off-target, nucleic acid can be used. The sgRNA component of the complexes is responsible for targeting a particular nucleic acid target. In particular, the spacer region of the sgRNA includes the region of complementarity to the targeted nucleic acid sequence. Thus, the spacer is the polynucleotide sequence that can specifically hybridize to a target nucleic acid sequence. The spacer element interacts with the target nucleic acid sequence through hydrogen bonding between complementary base pairs. A spacer element binds to a selected nucleic acid target sequence. Accordingly, the spacer element is the DNA target-binding sequence.

Thus, binding specificity is determined by both sgRNA-DNA base pairing and the PAM sequence juxtaposed to the DNA complementary region.

Thus, in an aspect of the present invention, a sgRNA/dCas9 complex is targeted to genomic loci similarly targeted by catalytically intact sgRNA/Cas9 complexes, and can stably bind DNA and subsequently block activity of proteins targeted to those loci. In this way, dCas9 can robustly impair binding and/or activity of endogenous transcription factors in eukaryotic cells.

In an exemplary embodiment, a sgRNA complexed with Cas9 (sgRNA/Cas9 complex) is directed to a genomic locus of interest to induce double-stranded breaks. The binding specificity is determined by both sgRNA-DNA base pairing and the PAM sequence juxtaposed to the DNA complementary region. Computational and/or experimental methods (e.g., sequencing, in silico DNA alignment methods) can be used to ascertain off-target nuclease activity (e.g., to determine the off-target loci). Such methods are described in detail below. Independently acting dCas9 proteins can be designed to target these off-target loci. These engineered dCas9 proteins can be deployed as site-specific nuclease “blockers” to obstruct catalytically intact sgRNA/Cas9 binding and nuclease activity.

sgRNA/Cas9 and sgRNA/dCas9 blockers may be introduced, for example into a cell or tissue, at differing concentrations. For example, sgRNA/Cas9 and sgRNA/dCas9 complexes can be introduced at a ratio of 1:2, 1:3, 1:4, 1:5, 1:6, 1:7, 1:8, 1:9, 1:10, 10:1, 9:1, 8:1, 7:1, 6:1, 5:1, 4:1, 3:1, or 2:1. Additionally, all of these components, i.e., sgRNA, Cas9, dCas9, etc., may be provided separately, e.g., as separately in vitro assembled complexes, using separate DNA or RNA constructs, or together, in a single construct, or in any combination. Typically, the sgRNA components will complex with Cas9 and dCas9 when provided to a cell. Additionally, cell lines, such as but not limited to HEK293 cells, are commercially available that constitutively express S. pyogenes Cas9 as well as S. pyogenes Cas9-GFP fusions. In this instance, cells can be transfected without catalytically active Cas9, as such is provided by the host cell.

sgRNA/Cas9 and sgRNA/dCas9 complexes may be introduced at differing time points. For example, sgRNA/Cas9 and sgRNA/dCas9 complexes can be introduced at least 1 minute apart, 5 minutes apart, 10 minutes apart, 30 minutes apart, 1 hour apart, 5 hours apart, or 15 hours apart or more. sgRNA/Cas9 and sgRNA/dCas9 complexes can be introduced at most 1 minute apart, 5 minutes apart, 10 minutes apart, 30 minutes apart, 1 hour apart, 5 hours apart, or 15 hours apart or more. sgRNA/Cas9 complexes can be introduced before the sgRNA/dCas9 complexes. sgRNA/Cas9 complexes can be introduced after the sgRNA/dCas9 complexes. sgRNA/Cas9 complexes and sgRNA/dCas9 complexes may be differentially regulated (i.e., differentially expressed or stabilized) via exogenously supplied agents (e.g., inducible DNA promoters or inducible Cas9 proteins).

sgRNA/Cas9 and sgRNA/dCas9 complexes can be introduced into a cell by a variety of means including transfection, transduction, electroporation, micelles and liposome delivery, lipid nanoparticles, viral delivery, protein injection, and the like, described more fully below.

sgRNA/dCas9 complexes may be directed to genomic loci that partially overlap. For example, these complexes can be directed to loci that overlap by at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, or 35 or more nucleotides. These complexes can be directed to loci that overlap by at most 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, or 35 or more nucleotides.

sgRNA/dCas9 complexes can be directed adjacent to sites of observed off-target nuclease activity and Cas9 binding. For example, these complexes can be directed to sites that are adjacent to a site of observed off-target activity by at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, or 35 or more nucleotides. Complexes can be directed to sites that are adjacent to a site of observed off-target activity by at most 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, or 35 or more nucleotides.

Multiple sgRNA/dCas9 complexes may be used to “tile” a given locus for maximum nuclease blocking activity. In some instances, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more sgRNA/dCas9 complexes are used. In some instances, at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more complexes are used. The complexes can cover a locus. Complexes can cover at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of a locus. The complexes can cover at most 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of a locus.
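For illustration, the percentage of a locus covered by a set of tiled sgRNA/dCas9 complexes can be estimated by merging the intervals the complexes occupy. The sketch below assumes that each complex is represented by a hypothetical footprint given as a 0-based, half-open (start, end) interval; how a footprint is measured is outside the scope of this example.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # 0-based, half-open: (start, end)


def coverage_percent(locus: Interval, footprints: Iterable[Interval]) -> float:
    """Percent of the locus covered by the union of blocker footprints.

    Footprints may overlap one another and may extend past the locus;
    overlapping regions are counted once and overhang is clipped.
    """
    locus_start, locus_end = locus
    if locus_end <= locus_start:
        raise ValueError("locus must have positive length")
    # Clip every footprint to the locus and drop those that miss it.
    clipped = sorted(
        (max(start, locus_start), min(end, locus_end))
        for start, end in footprints
        if min(end, locus_end) > max(start, locus_start)
    )
    covered = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # Close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching footprint: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / (locus_end - locus_start)
```

For a locus spanning positions 0-100 and footprints at (0, 20), (15, 40), and (90, 120), the merged footprints occupy positions 0-40 and 90-100, so the sketch reports 50% coverage.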

The blockers can reduce off-target binding of the active complexes by at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100%. The blockers can reduce off-target binding of the active complexes by at most 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100%.

Without wishing to be bound by a particular theory, a sgRNA/dCas9 complex can reduce binding of a sgRNA/Cas9 complex to an off-target nucleic acid by any mechanism. For example, the sgRNA/dCas9 complex can compete with the catalytically active complex for binding the off-target nucleic acid. The sgRNA/dCas9 complex can bind to the off-target nucleic acid, thereby creating steric hindrance for the sgRNA/Cas9 complex that prevents binding of the sgRNA/Cas9 complex to the off-target nucleic acid. The sgRNA/dCas9 complex can displace the sgRNA/Cas9 complex from the off-target nucleic acid. The sgRNA/dCas9 complex can inhibit the sgRNA/Cas9 complex from binding the off-target nucleic acid. The sgRNA/dCas9 complex can block the sgRNA/Cas9 complex from binding the off-target nucleic acid.

A sgRNA/dCas9 complex can reduce off-target nucleic acid binding, cleavage and/or modification by a sgRNA/Cas9 complex by 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100%, or any value within this range. Conversely, a sgRNA/dCas9 complex can increase site-specific binding and/or modification by 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100%, or any value within this range.

Computational methods for determining off-target nuclease activity with any of the methods described herein can comprise scanning the genomic sequence of a subject. The genomic sequence can be segmented in silico into a plurality of nucleic acid sequences. The segmented nucleic acid sequences can be aligned with the nucleic acid-targeting nucleic acid sequence. A sequence search algorithm can determine one or more off-target nucleic acid sequences by identifying segmented genomic sequences with alignments comprising a defined number of base-pair mismatches with the nucleic acid-targeting nucleic acid. The number of base-pair mismatches between a genomic sequence and a nucleic acid-targeting nucleic acid selected by an algorithm can be user-defined; for example, the algorithm can be programmed to identify off-target sequences with mismatches of up to five base pairs between the genomic sequence and the nucleic acid-targeting nucleic acid. In silico binding algorithms can be used to calculate binding and/or cleavage efficiency of each predicted off-target nucleic acid sequence by a site-directed polypeptide using a weighting scheme. These data can be used to calculate off-target activity for a given nucleic acid-targeting nucleic acid and/or site-directed polypeptide.
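The in silico scan described above can be sketched as a simple sliding-window search. This is a minimal illustration under stated assumptions, not the algorithm of any particular tool: the function names are hypothetical, only the forward strand is scanned, the PAM defaults to the S. pyogenes NGG motif located immediately 3′ of the candidate site, and no weighting scheme for binding or cleavage efficiency is applied.

```python
from typing import List, Tuple


def pam_matches(candidate: str, pattern: str) -> bool:
    """True if candidate matches the PAM pattern, where 'N' is any base."""
    return len(candidate) == len(pattern) and all(
        p == "N" or p == base for p, base in zip(pattern, candidate)
    )


def find_candidate_sites(genome: str, spacer: str, max_mismatches: int = 5,
                         pam: str = "NGG") -> List[Tuple[int, str, str, int]]:
    """Scan the forward strand for PAM-adjacent sites resembling the spacer.

    Returns a list of (position, site, pam_sequence, mismatch_count) for
    every window whose mismatch count is at most max_mismatches (the
    user-defined threshold). A hit with zero mismatches is the intended
    target or a perfect duplicate of it.
    """
    genome = genome.upper()
    spacer = spacer.upper().replace("U", "T")  # accept RNA spacers
    n, p = len(spacer), len(pam)
    hits = []
    # Segment the genome into overlapping windows of spacer length.
    for i in range(len(genome) - n - p + 1):
        pam_sequence = genome[i + n:i + n + p]
        if not pam_matches(pam_sequence, pam):
            continue
        site = genome[i:i + n]
        mismatches = sum(1 for a, b in zip(site, spacer) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, site, pam_sequence, mismatches))
    return hits
```

A fuller implementation would also scan the reverse-complement strand and would typically rely on an indexed aligner for genome-scale inputs; the exhaustive loop here is chosen only for readability.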

Off-target binding activity can be determined by experimental methods. In one non-limiting example, the experimental methods can comprise sequencing a nucleic acid sample contacted by a complex comprising a site-directed polypeptide and a nucleic acid-targeting nucleic acid. The contacted nucleic acid sample can be fixed or crosslinked to stabilize the protein-RNA-DNA complex. The complex comprising the site-directed polypeptide, the nucleic acid (e.g., target nucleic acid, off-target nucleic acid), and/or the nucleic acid-targeting nucleic acid can be captured from the nucleic acid sample with an affinity tag and/or capture agents. Nucleic acid purification techniques can be used to separate the target nucleic acid from the complex. Nucleic acid purification techniques can include spin column separation, precipitation, and electrophoresis. The nucleic acid can be prepared for sequencing analysis by shearing and ligation of adaptors. Preparation for sequencing analysis can include the generation of sequencing libraries of the eluted target nucleic acid.

Sequence determination methods can include but are not limited to pyrosequencing (for example, as commercialized by 454 Life Sciences, Inc., Branford, Conn.); sequencing by ligation (for example, as commercialized in the SOLiD™ technology, Life Technologies, Inc., Carlsbad, Calif.); sequencing by synthesis using modified nucleotides (such as commercialized in TruSeq™ and HiSeq™ technology by Illumina, Inc., San Diego, Calif., HeliScope™ by Helicos Biosciences Corporation, Cambridge, Mass., and PacBio RS by Pacific Biosciences of California, Inc., Menlo Park, Calif.); sequencing by ion detection technologies (Ion Torrent, Inc., South San Francisco, Calif.); sequencing of DNA nanoballs (Complete Genomics, Inc., Mountain View, Calif.); nanopore-based sequencing technologies (for example, as developed by Oxford Nanopore Technologies, LTD, Oxford, UK); capillary sequencing (e.g., such as commercialized in MegaBACE by Molecular Dynamics, Inc., Sunnyvale, Calif.); electronic sequencing; single molecule sequencing (e.g., such as commercialized in SMRT™ technology by Pacific Biosciences, Menlo Park, Calif.); droplet microfluidic sequencing; sequencing by hybridization (such as commercialized by Affymetrix, Santa Clara, Calif.); bisulfite sequencing; and other known highly parallelized sequencing methods.

In some aspects, sequencing is performed by microarray analysis, such as in SNP genotyping by binding. Sequencing analysis can determine the identity and frequency of an off-target binding site for a given nucleic acid-targeting nucleic acid, by counting the number of times a particular binding site is read. The library of sequenced nucleic acids can include target nucleic acids and off-target nucleic acids.
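The read-counting step described above can be illustrated with a short sketch. It assumes that sequencing reads have already been mapped to binding sites and that each site is represented by a hashable identifier such as a hypothetical (chromosome, position) pair; read mapping itself is not shown.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def site_frequencies(
    mapped_sites: Iterable[Hashable], on_target: Hashable
) -> List[Tuple[Hashable, int, float, str]]:
    """Tally how often each binding site was read.

    mapped_sites -- one site identifier per mapped read (assumed input)
    on_target    -- identifier of the intended target site

    Returns (site, read_count, fraction_of_reads, label) tuples sorted
    from most to least frequently read site.
    """
    counts = Counter(mapped_sites)
    total = sum(counts.values())
    return [
        (site, n, n / total,
         "on-target" if site == on_target else "off-target")
        for site, n in counts.most_common()
    ]
```

For example, three reads mapped to ("chr1", 100) and one read mapped to ("chr2", 500), with ("chr1", 100) as the intended target, give read fractions of 0.75 on-target and 0.25 off-target.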

Off-target binding activity can be determined by additional experimental methods. The experimental methods can comprise inserting a donor oligonucleotide into a cleaved site (Tsai, S. Q. et al., “GUIDE-seq enables genome wide profiling of off-target cleavage by CRISPR-Cas nucleases,” Nature Biotech. (2015) 33:187-197). The genomic DNA is then fragmented, adapters are appended, and PCR is performed with primers complementary to the donor oligonucleotide and adapter sequences. The amplified sequences are sequenced and then mapped back to a reference genome. Other experimental methods rely on exploiting double-stranded break induced translocations of genomic DNA to experimentally induced (via the creation of double-stranded breaks) genomic “bait” sites (Frock, R. L. et al., “Genome-wide detection of DNA double-stranded breaks induced by engineered nucleases,” Nature Biotech. (2015) 33:179-186). Genomic DNA is subsequently fragmented, adapters are appended, and PCR is performed with primers complementary to the known “bait” site and adapter sequence. The amplified sequences are sequenced and then mapped back to a reference genome.

In some embodiments, Cas9 and/or dCas9 proteins may be modified or fused to additional protein domains. The fused additional protein domains may enhance the ability to block, impair, or inactivate active Cas9 complexes. Examples of proteins and domains that can be fused to a Cas9 or dCas9 protein include, but are not limited to, a nuclease, a transposase, a methylase, a transcription factor repressor or activator domain (e.g., such as KRAB and VP16), co-repressor and co-activator domains, DNA methyltransferases, histone acetyltransferases, histone deacetylases, and DNA cleavage domains (e.g., a cleavage domain from the endonuclease FokI). In some embodiments, a non-native sequence can confer new functions to the fusion protein. Such functions include, but are not limited to, the following: methyltransferase activity, demethylase activity, deamination activity, dismutase activity, alkylation activity, depurination activity, oxidation activity, pyrimidine dimer forming activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, photolyase activity, glycosylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, sumoylating activity, desumoylating activity, ribosylation activity, deribosylation activity, myristoylation activity, remodelling activity, protease activity, oxidoreductase activity, transferase activity, hydrolase activity, lyase activity, isomerase activity, synthase activity, synthetase activity, demyristoylation activity, and any combinations thereof.

In some instances, a donor polynucleotide is inserted into the target nucleic acid when the target nucleic acid is cleaved. The methods can therefore be used, for example, to modify genomic DNA in a eukaryotic cell isolated from an organism. Further, the methods can also comprise contacting the nucleic acid target sequence in the genomic DNA with a donor polynucleotide, wherein the modification comprises integration of at least a portion of the donor polynucleotide at the nucleic acid target sequence.

Donor polynucleotide insertion can be performed by the homologous recombination machinery of the cell. The donor polynucleotide may comprise homology arms that are partially or fully complementary to the regions of the target nucleic acid outside of the break point. Donor polynucleotide insertion can also be performed by non-homologous DNA repair machinery of the cell, where no homology arms are required. A discussion of donor polynucleotides is presented more fully below.

In an embodiment, the donor polynucleotide can be tethered to the sgRNA/dCas9 complex to position it near the cleavage site targeted by the active sgRNA/Cas9 complex. See FIG. 7A. In this way, homology-directed repair, as described below, can be achieved at higher rates.

One particular embodiment of the methods described herein is illustrated in FIGS. 4 and 5. FIG. 4 depicts an example of undesirable off-target binding and cleavage of a nuclease during genome engineering. A target nucleic acid 115 can be contacted with a complex comprising a site-directed polypeptide (e.g., Cas9) 105 and a nucleic acid-targeting nucleic acid (e.g., a sgRNA) 110. The complex comprising the Cas9 105 and sgRNA 110 can bind to a target nucleic acid 120. In some instances, the complex comprising the Cas9 105 and sgRNA 110 can bind to an off-target nucleic acid 125. In a cleavage step 130, the Cas9 of the complex can cleave 135 the target nucleic acid 120 and the off-target nucleic acid 125, thereby generating off-target effects.

FIG. 5 depicts an exemplary embodiment of reducing off-target binding and cleavage events using dCas9 blockers. A target nucleic acid 215 can be contacted with a complex comprising a site-directed polypeptide (e.g., Cas9) 205 and a nucleic acid-targeting nucleic acid (e.g., single-guide RNA) 210. The complex comprising the Cas9 205 and sgRNA 210 can bind to a target nucleic acid 220. In some instances, the complex comprising the Cas9 205 and sgRNA 210 can bind to an off-target nucleic acid 225. Complexes comprising an engineered dCas9 protein 235 and an engineered sgRNA 236 can be introduced and contacted 230 with the target nucleic acid. The dCas9 complexes can either displace or prevent the binding of complexes comprising active Cas9 205. The active Cas9 205 can cleave 240/245 the target nucleic acid 220. The active Cas9 205 may not cleave the off-target nucleic acid 225 because the dCas9 235 is preventing its binding and cleavage. In this way, off-target binding and cleavage may be prevented.

In another embodiment, the invention is directed to a method forincreasing the efficiency of nucleic acid insertion by HDR ornon-homologous repair mechanisms. As explained above, multiple repairpathways can compete at site-directed DNA breaks. Such breaks can berepaired through, for example, non-homologous end-joining (NHEJ) orhomology-directed repair (HDR). NHEJ can result in imperfect repair andthe addition or deletion of one or more bases, whereas HDR can beutilized to insert rationally designed exogenous DNA sequences. Repairof a double-strand break (DSB) in the presence of a donor polynucleotideresults in a portion of breaks faithfully repaired by HDR and a portionof breaks where another less reliable repair pathway, such as NHEJ, isengaged, resulting in mixed repair outcomes. Alternative repair pathwaysfor insertion of DNA using non-homologous mechanisms can also result inthe insertion of donor DNA at the break site.

HDR relies on the presence of a donor polynucleotide, a piece of DNAthat shares homology with sequences at or near a DNA break, that can beused to repair DNA breaks. Without wishing to be bound by any particulartheory or mechanism, in some embodiments, the present invention providesfor methods for using site-directed polypeptides (e.g., Cas9 nucleases)to create a substrate that will engage an alternative HDR pathway,similar to the single-strand annealing (SSA) branch of HDR, and willprevent competing DNA repair pathways, such as NHEJ, from repairing thebreak.

Single-strand annealing (SSA) is a process that is initiated when a break is introduced between two repetitive sequences oriented in the same direction. Four steps are generally necessary for the repair of breaks by SSA: (1) an end resection step which extends the repeated sequences and forms long 3′-ssDNA; (2) an annealing step in which the two repetitive sequences are annealed together, forming a flap structure; (3) a second resection step in which the flap structures formed by the regions between the repeats are resected; and (4) ligation of the ends. HDR at DNA nicks occurs via a mechanism sometimes termed “alternative-HDR” that shares many of the same genetic dependencies as SSA, such as inhibition by RAD51 and BRCA2.

The inventors herein have developed an engineered CRISPR system that generates at least two single-stranded nicks on the same strand of a target double-stranded nucleic acid and provides a donor polynucleotide that can anneal to the non-nicked strand. This results in the accurate insertion of exogenous DNA with little background mutagenic end-joining.

This method employs tandem Cas9 molecules that comprise one or more mutations that convert the catalytically active Cas9 molecules into nickases. The nickases are targeted to specific sites using sgRNAs designed to target two sites on the same strand in a double-stranded target nucleic acid, to generate two nicks (i.e., single-stranded breaks) on the targeted strand.

Any Cas9 molecule can be used, as described in detail above, so long as the Cas9 functions as a nickase. In some embodiments, this can be accomplished by introducing a point mutation at either of the two catalytic residues (D10A or H840A, numbered relative to S. pyogenes Cas9) in the gene encoding Cas9. An amino acid mutation at either position in Cas9 inactivates one of the two nuclease catalytic activities and converts Cas9 to a nickase enzyme that makes single-stranded breaks at the target sites. In the Cas9 double mutant carrying both the D10A and H840A changes, however, both the nuclease and nickase activities are completely inactivated. Targeting specificity is determined by complementary base-pairing of a sgRNA to the genomic loci which include PAM sequences adjacent thereto.
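The relationship between the two catalytic-residue mutations and the resulting activity can be summarized in a short Python sketch. It is illustrative only: the function names `cas9_activity` and `is_same_mutation_pair` are hypothetical, and the sketch assumes S. pyogenes Cas9 numbering and considers no mutations other than D10A and H840A.

```python
# Catalytic-residue mutations named in the text (S. pyogenes Cas9 numbering).
CATALYTIC_MUTATIONS = {"D10A", "H840A"}


def cas9_activity(mutations):
    """Classify a Cas9 variant from its catalytic-residue mutations.

    Returns "nuclease" (no mutation), "nickase" (exactly one of
    D10A / H840A), or "dCas9" (both mutations present).
    """
    present = {m.upper() for m in mutations}
    unknown = present - CATALYTIC_MUTATIONS
    if unknown:
        raise ValueError(f"mutations not handled by this sketch: {sorted(unknown)}")
    if present == CATALYTIC_MUTATIONS:
        return "dCas9"
    if present:
        return "nickase"
    return "nuclease"


def is_same_mutation_pair(first, second):
    """True when both variants are nickases carrying the same single mutation,
    e.g. two D10A nickases or two H840A nickases."""
    a = {m.upper() for m in first}
    b = {m.upper() for m in second}
    return cas9_activity(a) == "nickase" and a == b


if __name__ == "__main__":
    print(cas9_activity(["D10A"]))                      # nickase
    print(cas9_activity(["D10A", "H840A"]))             # dCas9
    print(is_same_mutation_pair(["D10A"], ["D10A"]))    # True
```

Matching mutations are only one condition for same-strand nicking; the two sgRNAs must also be designed against the same strand of the target.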

The nickases can comprise any mutation that enables the Cas9 to cleaveonly one strand of a double-stranded target nucleic acid. For example,as explained above, the Cas9 (e.g., Cas9 from S. pyogenes) can comprisea D10A mutation in one of its nuclease domains, or in a correspondingresidue in an orthologous Cas9 to render the molecule a nickase. TheCas9 (e.g., Cas9 from S. pyogenes) can comprise a H840A mutation in oneof its nuclease domains, or a corresponding residue in an orthologousCas9 to render the molecule a nickase.

Accordingly, any Cas9 molecule that has nickase activity and only makessingle-stranded breaks can be used. As explained above, Cas9 proteinsare known and the Cas9 proteins can be derived from any bacterialspecies, subspecies or strain that encodes the same. Also useful areproteins encoded by Cas9 orthologs, Cas9-like synthetic proteins, andvariants and modifications thereof. In one embodiment, Cas9 orthologsare selected based on the particular protospacer adjacent motif (PAM)sequences present on the target nucleic acid. For example, S. pyogenesCas9 targets NGG sequences. One of skill in the art can readilydetermine the particular Cas9 to mutate based on the particularspecificity desired.

Moreover, the nickases used in the present methods should be paired suchthat nicks occur on the same strand. For example, both nickases used caninclude a D10A mutation, or both can include a H840A mutation. Onenickase can be a S. pyogenes Cas9 nickase and the other can be a nickasethat targets a PAM with a different adjacent sequence than targeted bythe S. pyogenes Cas9 nickase, such as a nickase designed from anorthologous Cas9 protein, so long as the same strand is nicked. Theappropriate nickases for use in the present methods are therefore basedon the nucleic acid target sequence and on a determination ofPAM-adjacent sequences present at the desired cleavage sites. In thisway, the method provides flexibility for single-stranded cleavage of thetarget nucleic acid.

The nickases can cleave the sense strand of the double-stranded target nucleic acid or the anti-sense strand of the double-stranded target nucleic acid (e.g., DNA). The nickases can both cleave the same strand of the double-stranded target nucleic acid.

The two nickases can be designed to cleave at a distance of at least 10,20, 30, 40, 50, 60, 70, 80, 90, or 100, 500, 1000, or 5000 or more basesaway from each other. The two nickases can be designed to cleave at adistance of at most 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100, 500,1000, or 5000 or more bases away from each other. The distance betweenthe nicks will determine the length of the donor polynucleotide to beprovided for insertion.
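As a rough illustration of how candidate nick sites on the same strand might be enumerated, the following Python sketch scans a target sequence for 20-nt protospacers followed by an NGG PAM and pairs sites whose protospacers lie on the same strand within a chosen distance window. The function names, the 20-nt spacer length, the default distance window, and the assumption that the nick falls 3 bp upstream of the PAM are illustrative choices, not requirements of the methods described herein.

```python
import re
from itertools import combinations

# 20-nt protospacer followed by an NGG PAM; the lookahead allows overlapping hits.
PAM_SITE = re.compile(r"(?=[ACGT]{20}[ACGT]GG)")


def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def nick_sites(seq):
    """Return (protospacer_strand, cut_index) for every NGG site.

    cut_index is expressed in top-strand coordinates and is assumed to lie
    3 bp upstream of the PAM (an assumption made for this sketch).
    """
    seq = seq.upper()
    n = len(seq)
    sites = [("+", m.start() + 17) for m in PAM_SITE.finditer(seq)]
    sites += [("-", n - (m.start() + 17)) for m in PAM_SITE.finditer(revcomp(seq))]
    return sorted(sites, key=lambda site: site[1])


def same_strand_pairs(seq, min_gap=10, max_gap=100):
    """Pair sites whose protospacers are on the same strand and whose
    cut positions are between min_gap and max_gap bases apart."""
    pairs = []
    for (s1, c1), (s2, c2) in combinations(nick_sites(seq), 2):
        if s1 == s2 and min_gap <= c2 - c1 <= max_gap:
            pairs.append((s1, c1, c2))
    return pairs


if __name__ == "__main__":
    demo = "A" * 20 + "TGG" + "A" * 27 + "TGG" + "A" * 10
    print(same_strand_pairs(demo))
```

Grouping by protospacer strand reflects the requirement that two nickases of the same type nick the same strand; a pairing that mixes orthologous nickases would need a PAM pattern for each ortholog.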

As explained above, once the target nucleic acid is nicked, a donorpolynucleotide can be directed to, and inserted into a target site ofinterest to modify the target nucleic acid. Targeted DNA modificationsusing donor polynucleotides for large changes (e.g., more than 100 bpinsertions or deletions) traditionally use plasmid-based donor templatesthat contain homology arms flanking the site of alteration. Each arm canvary in length, but is typically longer than about 200 bp for largeinsertion, the size of the arms depending on the size of the donorpolynucleotide and the target polynucleotide, as explained in detailabove.

For shorter modifications (e.g., less than 50 bp), single-stranded oligonucleotides such as DNA oligonucleotides, partially double-stranded oligonucleotides, nicked double-stranded donors, and the like, can be used in place of targeting plasmids. In this embodiment, for example, single-stranded oligonucleotides containing flanking sequences with homology in proximity to each nick can be used, and can be oriented in either the sense or antisense direction relative to the target locus. For single-stranded DNA oligonucleotide design, typically an oligonucleotide with around 100-150 bp total homology is used. The mutation is introduced in the middle, giving approximately 50-75 bp homology arms. However, these numbers can vary, depending on the size of the donor polynucleotide and the target polynucleotide. Non-symmetrical polynucleotide donors composed of two DNA oligonucleotides can also be used. The two oligonucleotides are partially complementary, and each includes a flanking region of homology. For some modifications, the donor polynucleotide can have at least one arm with approximately 10 bases of homology to the target sequence. For some modifications, the donor polynucleotide can have at least one arm with less than 100 bases of homology to the target sequence. For other modifications, the donor can have more than 100 bases of homology to the target sequence. In some cases, the donor can have homology arms of the same length. In other cases, the donor can have homology arms of different lengths. In some cases, at least one of the homology arms is of zero length.
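The single-stranded oligonucleotide layout described above (a modification flanked by two homology arms) can be sketched in Python as follows. The function name `design_ssodn`, its parameters, and the 60-base default arm length are hypothetical choices made for illustration; arms of different lengths, including a zero-length arm, are permitted as in the text.

```python
def design_ssodn(target, start, end, replacement, left_arm=60, right_arm=60):
    """Build a single-stranded donor: left arm + replacement + right arm.

    target       top-strand sequence of the locus
    start, end   half-open interval of target bases replaced by the edit
                 (start == end gives a pure insertion)
    replacement  sequence to place between the homology arms
    left_arm / right_arm
                 requested arm lengths; arms are truncated at sequence ends
    Returns (donor, actual_left_length, actual_right_length).
    """
    if not 0 <= start <= end <= len(target):
        raise ValueError("edit interval lies outside the target sequence")
    if left_arm < 0 or right_arm < 0:
        raise ValueError("arm lengths must be non-negative")
    left = target[max(0, start - left_arm):start]
    right = target[end:end + right_arm]
    return left + replacement + right, len(left), len(right)


if __name__ == "__main__":
    locus = "ACGT" * 50                      # 200-base placeholder locus
    donor, l_len, r_len = design_ssodn(locus, 100, 101, "T")
    print(len(donor), l_len, r_len)          # 121 60 60
```

To orient the donor in the antisense direction relative to the target locus, the returned sequence would be reverse-complemented.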

Thus, a donor polynucleotide can be designed to anneal to thesingle-stranded gap that results from the nicks made by the twonickases. As explained above, the donor polynucleotide can additionallycomprise regions of homology with the sequences outside the breaks. Thesize of the regions of homology will be determined by the size of thetarget polynucleotide and can be at least 5, 10, 15, 20, 25, 30, 35 ormore nucleotides in length, the size depending on the size of the donorpolynucleotide and the target nucleic acid. The regions of homology canbe at most 5, 10, 15, 20, 25, 30, 35 or more nucleotides in length. Thedonor polynucleotide can be single-stranded. The single-stranded donorpolynucleotide can be inserted into the break created by the two tandemnickases.

FIG. 6 depicts an exemplary embodiment of the present methods. Here, two Cas9 D10A nickases are used in tandem to excise a single-stranded region of DNA on the same strand of a target double-stranded nucleic acid. As shown in FIG. 6A, two Cas9 nickases (in this case S. pyogenes Cas9 nickases with D10A mutations in the RuvC endonuclease domain) are targeted to two spaced-apart positions on the sense strand of a target polynucleotide using two sgRNA/Cas9 nickase complexes. Targeting is accomplished using a spacer sequence present in the sgRNA that has been designed to specifically target a complementary region in the target nucleic acid sequence. Binding specificity is determined by both sgRNA-DNA base pairing and the PAM, in this case, NGG, juxtaposed to the DNA complementary region (see, e.g., Mojica F. J. et al., “Short motif sequences determine the targets of the prokaryotic CRISPR defence system” Microbiology (2009) 155:733-740; Shah S. A. et al., “Protospacer recognition motifs: mixed identities and functional diversity” RNA Biology (2013) 10:891-899; Jinek M. et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity” Science (2012) 337:816-821). The targeted single strand is then cleaved (FIG. 6B) and the donor, with overlapping flanking regions, inserted (FIG. 6C).

In another embodiment, the invention is directed to additional methodsfor increasing HDR. The current methodology for introducing a desiredchange into a gene includes transfecting, electroporating, ormicroinjecting a site-specific endonuclease and donor molecules into acell or embryo and using passive diffusion to locate the donor moleculesthroughout the nucleus (Lin, S. et al. “Enhanced homology-directed humangenome engineering by controlled timing of CRISPR/Cas9 delivery,” eLife(2014) Dec; doi: 10.7554/eLife.04766). However, this method of HDRtypically has low efficiency. Unlike passive diffusion, the methodsdescribed below position the donor molecule near the cut site toincrease HDR efficiency.

In these methods, one or more sgRNA/dCas9 complexes are used, along witha catalytically active sgRNA/Cas9 complex. The one or more sgRNA/dCas9complexes include a polynucleotide donor associated therewith toposition the donor polynucleotide near a target site in order toincrease HDR efficiency. Thus, the tethered dCas9 can position the donormolecule in an orientation that will increase the likelihood that thedonor molecule will be incorporated into the target site through HDR,thereby introducing a desired change to the target sequence.

As explained above, the donor polynucleotide can be double- orsingle-stranded DNA, RNA, a vector, plasmid, or the like and can be usedto transfer genetic information (i.e., polynucleotide sequences) fromthe donor at the site of the break in the target nucleic acid. The donorcan be used to insert or replace polynucleotide sequences in a targetsequence, for example, to introduce a polynucleotide that encodes aprotein or functional RNA (e.g., siRNA), to introduce a protein tag, tomodify a regulatory sequence of a gene, or to introduce a regulatorysequence to a gene (e.g. a promoter, an enhancer, an internal ribosomeentry sequence, a start codon, a stop codon, a localization signal, orpolyadenylation signal), to modify a nucleic acid sequence (e.g.,introduce a mutation), and the like.

A single sgRNA/dCas9 complex can be used with the associated donor, asshown in FIG. 7A. Alternatively, two such complexes can be used toposition the donor across the cut site as shown in FIG. 7B. The dCas9and Cas9 molecules and guide polynucleotides used in the complexes canbe any of those as described above.

When one sgRNA/dCas9 complex is used, the complex can target nucleicacid either upstream or downstream of the nucleic acid targeted by thecatalytically active sgRNA/Cas9 complex. A donor polynucleotide isassociated with the sgRNA/dCas9 complex. In this way, the donorpolynucleotide is brought into proximity with the cleaved target nucleicacid and HDR will insert at least a portion of the donor polynucleotideat the cleaved site.

When two sgRNA/dCas9 complexes are used, the second sgRNA/dCas9 complexis designed to target nucleic acid downstream of the catalyticallyactive sgRNA/Cas9 complex when the first sgRNA/dCas9 targets nucleicacid upstream of the catalytically active sgRNA/Cas9 complex.Alternatively, the second sgRNA/dCas9 complex is designed to targetnucleic acid upstream of the catalytically active sgRNA/Cas9 complexwhen the first sgRNA/dCas9 targets nucleic acid downstream of thecatalytically active sgRNA/Cas9 complex. Thus, the target for the activesgRNA/Cas9 complex is in a position between the two inactive complexes.Additionally, the 5′ end of the polynucleotide donor will be associatedwith one of the inactive sgRNA/dCas9 complexes and the 3′ end associatedwith the other of the inactive complexes such that the polynucleotidedonor is positioned across the cleavage site for insertion using HDR.One of skill in the art can readily determine which end of thepolynucleotide donor to associate with each complex based on the desiredtarget.

The donor is tethered to the complexes using methods well known in theart. To do so, the backbone of the sgRNA can be extended to include aregion complementary to the donor molecule. For example, the sgRNA inthe sgRNA/dCas9 complex can include a number of extra nucleotides, e.g.,5-20, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20or even more, extra nucleotides at the 3′ end of the sgRNA that willbind in a complementary fashion to the 5′ or 3′ end of a single-strandedDNA donor polynucleotide. In this manner, the donor polynucleotide willbe positioned to interact with the sgRNA/Cas9-induced cut site and thecell's endogenous HDR machinery will incorporate the donor into thecleavage site. The sgRNA/dCas9 tethered donor polynucleotide ispositioned upstream or downstream of the double-stranded break and isavailable at a higher local concentration for HDR.
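The 3′ extension described above can be sketched in Python. The extension must be the RNA reverse complement of the chosen donor end so that the two strands pair in antiparallel fashion. The function names, the 15-nucleotide default, and the placeholder sgRNA sequence are hypothetical; they are not taken from the original disclosure.

```python
# Map each DNA base of the donor to the RNA base that pairs with it.
_DNA_TO_RNA_COMPLEMENT = str.maketrans("ACGT", "UGCA")


def tether_extension(donor, n=15, donor_end="3'"):
    """Return the RNA extension (written 5'->3') that base-pairs with the
    first or last n nucleotides of a single-stranded DNA donor."""
    donor = donor.upper()
    if set(donor) - set("ACGT"):
        raise ValueError("donor must contain only A, C, G, T")
    if not 0 < n <= len(donor):
        raise ValueError("n must be between 1 and the donor length")
    if donor_end == "3'":
        region = donor[-n:]
    elif donor_end == "5'":
        region = donor[:n]
    else:
        raise ValueError("donor_end must be \"5'\" or \"3'\"")
    # Complement each base, then reverse, to obtain the antiparallel strand.
    return region.translate(_DNA_TO_RNA_COMPLEMENT)[::-1]


def extend_sgrna(sgrna, donor, n=15, donor_end="3'"):
    """Append the tethering extension to the 3' end of an sgRNA sequence."""
    return sgrna + tether_extension(donor, n, donor_end)


if __name__ == "__main__":
    donor = "AAACCCGGGTTTACGTACGT"       # placeholder donor sequence
    print(tether_extension(donor, 5, "3'"))   # ACGUA
    print(tether_extension(donor, 5, "5'"))   # GGUUU
```

Whether a given extension length tethers the donor stably in cells would need to be determined empirically; the sketch addresses only sequence complementarity.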

In all of the embodiments of the above-described methods, the variouscomponents can be provided to a cell or in vitro, for example, usingexpression cassettes encoding a Cas9, a dCas9, sgRNA; a donorpolynucleotide, etc. These components can be present on a singlecassette or multiple cassettes, in the same or different constructs.Expression cassettes typically comprise regulatory sequences that areinvolved in one or more of the following: regulation of transcription,post-transcriptional regulation, and regulation of translation.Expression cassettes can be introduced into a wide variety of organismsincluding bacterial cells, yeast cells, plant cells, and mammaliancells. Expression cassettes typically comprise functional regulatorysequences corresponding to the organism(s) into which they are beingintroduced.

In one aspect, all or a portion of the various components of the methods are provided in vectors, including expression vectors, comprising polynucleotides coding for a Cas9, a dCas9, a sgRNA and/or a donor polynucleotide. Vectors useful for practicing the present invention include plasmids, viruses (including phage), and integratable DNA fragments (i.e., fragments integratable into the host genome by homologous recombination). A vector replicates and functions independently of the host genome, or may, in some instances, integrate into the genome itself. Suitable replicating vectors will contain a replicon and control sequences derived from species compatible with the intended expression host cell. Transformed host cells are cells that have been transformed or transfected with the vectors constructed using recombinant DNA techniques.

General methods for construction of expression vectors are known in theart. Expression vectors for most host cells are commercially available.There are several commercial software products designed to facilitateselection of appropriate vectors and construction thereof, such asinsect cell vectors for insect cell transformation and gene expressionin insect cells, bacterial plasmids for bacterial transformation andgene expression in bacterial cells, yeast plasmids for celltransformation and gene expression in yeast and other fungi, mammalianvectors for mammalian cell transformation and gene expression inmammalian cells or mammals, viral vectors (including retroviral,lentiviral, and adenoviral vectors) for cell transformation and geneexpression and methods to easily enable cloning of such polynucleotides.SnapGene™ (GSL Biotech LLC, Chicago, Ill.;snapgene.com/resources/plasmid_files/your_time_is_valuable/), forexample, provides an extensive list of vectors, individual vectorsequences, and vector maps, as well as commercial sources for many ofthe vectors.

Expression cassettes typically comprise regulatory sequences that areinvolved in one or more of the following: regulation of transcription,post-transcriptional regulation, and regulation of translation.Expression cassettes can be introduced into a wide variety of organismsincluding bacterial cells, yeast cells, mammalian cells, and plantcells. Expression cassettes typically comprise functional regulatorysequences corresponding to the host cells or organism(s) into which theyare being introduced. Expression vectors can also includepolynucleotides encoding protein tags (e.g., poly-His tags,hemagglutinin tags, fluorescent protein tags, bioluminescent tags,nuclear localization tags). The coding sequences for such protein tagscan be fused to the coding sequences or can be included in an expressioncassette, for example, in a targeting vector.

In some embodiments, polynucleotides encoding one or more of the various components are operably linked to an inducible promoter, a repressible promoter, or a constitutive promoter.

Several expression vectors have been designed for expressing guide polynucleotides. See, e.g., Shen, B. et al. “Efficient genome modification by CRISPR-Cas9 nickase with minimal off-target effects,” Nat. Methods (2014) March 2. doi: 10.1038/nmeth.2857. Additionally, vectors and expression systems are commercially available, such as from New England Biolabs (Ipswich, Mass.) and Clontech Laboratories (Mountain View, Calif.). Vectors can be designed to simultaneously express a target-specific sgRNA using a U2 or U6 promoter, a Cas9 and/or dCas9, and if desired, a marker protein, for monitoring transfection efficiency and/or for further enriching/isolating transfected cells by flow cytometry.

Vectors can be designed for expression of various components of the described methods in prokaryotic or eukaryotic cells. Alternatively, transcription can be in vitro, for example using T7 promoter regulatory sequences and T7 polymerase. Other RNA polymerase and promoter sequences can be used.

Vectors can be introduced into and propagated in a prokaryote. Prokaryotic vectors are well known in the art. Typically a prokaryotic vector comprises an origin of replication suitable for the target host cell (e.g., oriC derived from E. coli, pUC derived from pBR322, pSC101 derived from Salmonella, 15A origin (derived from p15A), and bacterial artificial chromosomes). Vectors can include a selectable marker (e.g., genes encoding resistance for ampicillin, chloramphenicol, gentamicin, and kanamycin). Zeocin™ (Life Technologies, Grand Island, N.Y.) can be used as a selection agent in bacteria, fungi (including yeast), plants and mammalian cell lines. Accordingly, vectors can be designed that carry only one drug resistance gene for Zeocin for selection work in a number of organisms. Useful promoters are known for expression of proteins in prokaryotes, for example, T5, T7, Rhamnose (inducible), Arabinose (inducible), and PhoA (inducible). Further, T7 promoters are widely used in vectors that also encode the T7 RNA polymerase. Prokaryotic vectors can also include ribosome binding sites of varying strength, and secretion signals (e.g., mal, sec, tat, ompC, and pelB). In addition, vectors can comprise RNA polymerase promoters for the expression of sgRNAs. Prokaryotic RNA polymerase transcription termination sequences are also well known (e.g., transcription termination sequences from S. pyogenes).

Integrating vectors for stable transformation of prokaryotes are also known in the art (see, e.g., Heap, J. T., et al., “Integration of DNA into bacterial chromosomes from plasmids without a counter-selection marker,” Nucleic Acids Res. (2012) 40:e59).

Expression of proteins in prokaryotes is typically carried out in Escherichia coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or non-fusion proteins.

A wide variety of RNA polymerase promoters suitable for expression of the various components are available in prokaryotes (see, e.g., Jiang, Y., et al., “Multigene editing in the Escherichia coli genome via the CRISPR-Cas9 system,” Appl Environ Microbiol. (2015) 81:2506-2514; Estrem, S. T., et al., (1999) “Bacterial promoter architecture: subsite structure of UP elements and interactions with the carboxy-terminal domain of the RNA polymerase alpha subunit,” Genes Dev. 13(16):2134-47).

In some embodiments, a vector is a yeast expression vector comprising one or more components of the above-described methods. Examples of vectors for expression in Saccharomyces cerevisiae include, but are not limited to, the following: pYepSec1, pMFa, pJRY88, pYES2, and picZ. Methods for gene expression in yeast cells are known in the art (see, e.g., Methods in Enzymology, Volume 194, “Guide to Yeast Genetics and Molecular and Cell Biology, Part A,” (2004) Christine Guthrie and Gerald R. Fink (eds.), Elsevier Academic Press, San Diego, Calif.). Typically, expression of protein-encoding genes in yeast requires a promoter operably linked to a coding region of interest plus a transcriptional terminator. Various yeast promoters can be used to construct expression cassettes for expression of genes in yeast. Examples of promoters include, but are not limited to, promoters of genes encoding the following yeast proteins: alcohol dehydrogenase 1 (ADH1) or alcohol dehydrogenase 2 (ADH2), phosphoglycerate kinase (PGK), triose phosphate isomerase (TPI), glyceraldehyde-3-phosphate dehydrogenase (GAPDH; also known as TDH3, or triose phosphate dehydrogenase), galactose-1-phosphate uridyl-transferase (GAL7), UDP-galactose epimerase (GAL10), iso-1-cytochrome c (CYC1), acid phosphatase (PHO5) and glycerol-3-phosphate dehydrogenase gene (GPD1). Hybrid promoters, such as the CYC1/GAL10 promoter and the ADH2/GAPDH promoter (which is induced at low cellular-glucose concentrations, e.g., about 0.1 percent to about 0.2 percent), also may be used. In S. pombe, suitable promoters include the thiamine-repressed nmt1 promoter and the constitutive cytomegalovirus promoter in pTL2M.

Yeast RNA polymerase III promoters (e.g., promoters from 5S, U6 or RPR1 genes) as well as polymerase III termination sequences are known in the art (see, e.g., www.yeastgenome.org; Harismendy, O., et al., (2003) “Genome-wide location of yeast RNA polymerase III transcription machinery,” The EMBO Journal. 22(18):4738-4747).

In addition to a promoter, several upstream activation sequences (UASs), also called enhancers, may be used to enhance polypeptide expression. Exemplary upstream activation sequences for expression in yeast include the UASs of genes encoding these proteins: CYC1, ADH2, GAL1, GAL7, and GAL10. Exemplary transcription termination sequences for expression in yeast include the termination sequences of the α-factor, CYC1, GAPDH, and PGK genes. One or multiple termination sequences can be used.

Suitable promoters, terminators, and coding regions may be cloned intoE. coli-yeast shuttle vectors and transformed into yeast cells. Thesevectors allow strain propagation in both yeast and E. coli strains.Typically, the vector contains a selectable marker and sequencesenabling autonomous replication or chromosomal integration in each host.Examples of plasmids typically used in yeast are the shuttle vectorspRS423, pRS424, pRS425, and pRS426 (American Type Culture Collection,Manassas, Va.). These plasmids contain a yeast 2 micron origin ofreplication, an E. coli replication origin (e.g., pMB1), and aselectable marker.

The various components can also be expressed in insects or insect cells. Suitable expression control sequences for use in such cells are well known in the art. In some embodiments, it is desirable that the expression control sequence comprises a constitutive promoter. Examples of suitable strong promoters include, but are not limited to, the following: the baculovirus promoters for the p10, polyhedrin (polh), p6.9, capsid, UAS (contains a Gal4 binding site), Ac5, and cathepsin-like genes; the B. mori actin gene promoter; Drosophila melanogaster hsp70, actin, α-1-tubulin or ubiquitin gene promoters; RSV or MMTV promoters; copia promoter; gypsy promoter; and the cytomegalovirus IE gene promoter. Examples of weak promoters that can be used include, but are not limited to, the following: the baculovirus promoters for the ie1, ie2, ie0, etl, 39K (aka pp31), and gp64 genes. If it is desired to increase the amount of gene expression from a weak promoter, enhancer elements, such as the baculovirus enhancer element, hr5, may be used in conjunction with the promoter.

For the expression of some of the components of the present invention in insects, RNA polymerase III promoters are known in the art, for example, the U6 promoter. Conserved features of RNA polymerase III promoters in insects are also known (see, e.g., Hernandez, G., (2007) “Insect small nuclear RNA gene promoters evolve rapidly yet retain conserved features involved in determining promoter activity and RNA polymerase specificity,” Nucleic Acids Res. 35(1):21-34).

In another aspect, the various components are incorporated into mammalian vectors for use in mammalian cells. A large number of mammalian vectors suitable for use with the systems of the present invention are commercially available (e.g., from Life Technologies, Grand Island, N.Y.; NeoBiolab, Cambridge, Mass.; Promega, Madison, Wis.; DNA2.0, Menlo Park, Calif.; Addgene, Cambridge, Mass.).

Vectors derived from mammalian viruses can also be used for expressing the various components of the present methods in mammalian cells. These include vectors derived from viruses such as adenovirus, papovavirus, herpesvirus, polyomavirus, cytomegalovirus, lentivirus, retrovirus, vaccinia and Simian Virus 40 (SV40) (see, e.g., Kaufman, R. J., (2000) “Overview of vector design for mammalian gene expression,” Molecular Biotechnology, Volume 16, Issue 2, pp 151-160; Cooray S., et al., (2012) “Retrovirus and lentivirus vector design and methods of cell conditioning,” Methods Enzymol. 507:29-57). Regulatory sequences operably linked to the components can include activator binding sequences, enhancers, introns, polyadenylation recognition sequences, promoters, repressor binding sequences, stem-loop structures, translational initiation sequences, translation leader sequences, transcription termination sequences, translation termination sequences, primer binding sites, and the like. Commonly used promoters are the constitutive mammalian promoters CMV, EF1a, SV40, PGK1 (mouse or human), Ubc, CAG, CaMKIIa, and beta-Act, and others known in the art (Khan, K. H. (2013) “Gene Expression in Mammalian Cells and its Applications,” Advanced Pharmaceutical Bulletin 3(2), 257-263). Further, mammalian RNA polymerase III promoters, including H1 and U6, can be used.

In some embodiments, a recombinant mammalian expression vector iscapable of preferentially directing expression of the nucleic acid in aparticular cell type (e.g., using tissue-specific regulatory elements toexpress a polynucleotide). Tissue-specific regulatory elements are knownin the art and include, but are not limited to, the albumin promoter,lymphoid-specific promoters, neuron-specific promoters (e.g., theneurofilament promoter), pancreas-specific promoters, mammarygland-specific promoters (e.g., milk whey promoter), and in particularpromoters of T cell receptors and immunoglobulins.Developmentally-regulated promoters are also encompassed, e.g., themurine hox promoters and the alpha-fetoprotein promoter.

Numerous mammalian cell lines have been utilized for expression of gene products including HEK 293 (Human embryonic kidney) and CHO (Chinese hamster ovary). These cell lines can be transfected by standard methods (e.g., using calcium phosphate or polyethyleneimine (PEI), or electroporation). Other typical mammalian cell lines include, but are not limited to: HeLa, U2OS, A549, HT1080, CAD, P19, NIH 3T3, L929, N2a, MCF-7, Y79, SO-Rb50, Hep G2, DUKX-X11, J558L, and Baby hamster kidney (BHK) cells.

Methods of introducing polynucleotides (e.g., an expression vector) intohost cells are known in the art and are typically selected based on thekind of host cell. Such methods include, for example, viral orbacteriophage infection, transfection, conjugation, electroporation,calcium phosphate precipitation, polyethyleneimine-mediatedtransfection, DEAE-dextran mediated transfection, protoplast fusion,lipofection, liposome-mediated transfection, particle gun technology,direct microinjection, and nanoparticle-mediated delivery.

As explained above, one aspect of the present invention provides methodsof increasing Cas9-mediated genome engineering efficiency by eitherdecreasing the number of off-target nucleic acid double-stranded breaks,and/or enhancing HDR of a cleaved target nucleic acid, thus modifyinggenomes using HDR. The present invention also includes methods ofmodulating in vitro or in vivo transcription using the variouscomponents and complexes described herein. In one embodiment, asgRNA/dCas protein complex can repress gene expression by interferingwith transcription when the sgRNA directs DNA target binding of thecomplex to the promoter region of the gene. Use of the complexes toreduce transcription also includes complexes wherein the dCas protein isfused to a known down regulator of a target gene (e.g., a repressorpolypeptide). For example, expression of a gene is under the control ofregulatory sequences to which a repressor polypeptide can bind. A guidepolynucleotide can direct DNA target binding of a repressor proteincomplex to the DNA sequences encoding the regulatory sequences oradjacent the regulatory sequences such that binding of the repressorprotein complex brings the repressor protein into operable contact withthe regulatory sequences. Similarly, dCas9 can be fused to an activatorpolypeptide to activate or increase expression of a gene under thecontrol of regulatory sequences to which an activator polypeptide canbind.

Another method of the present invention is the use of sgRNA/dCas9 complexes in methods to isolate or purify regions of genomic DNA (gDNA). In an embodiment of the method, a dCas protein is fused to an epitope (e.g., a FLAG® epitope, Sigma Aldrich, St. Louis, MO) and a sgRNA directs DNA target binding of a sgRNA/dCas9 protein-epitope complex to DNA sequences within the region of genomic DNA to be isolated or purified. An affinity agent is used to bind the epitope and the associated gDNA bound to the sgRNA/dCas9 protein-epitope complex.

The present invention also encompasses gene-therapy methods for preventing or treating diseases, disorders, and conditions using the various methods described herein. In one embodiment, a gene-therapy method uses the introduction of nucleic acid sequences into an organism or cells of an organism (e.g., patient) to achieve expression of components of the present invention to provide modification of a target function. For example, cells from an organism may be engineered, ex vivo, by (i) introduction of vectors comprising expression cassettes expressing the various components, (ii) direct introduction of sgRNA and/or donor polynucleotides and Cas9 and/or dCas9 proteins, or (iii) introduction of combinations of these components. The engineered cells are provided to an organism (e.g., patient) to be treated.

Examples of gene-therapy and delivery techniques for therapy are known in the art (see, e.g., Kay, M. A., (2011) “State-of-the-art gene-based therapies: the road ahead,” Nature Reviews Genetics 12, 316-328; Wang, D., et al., (2014) “State-of-the-art human gene therapy: part I. Gene delivery technologies,” Discov Med. 18(97):67-77; Wang, D., et al., (2014) “State-of-the-art human gene therapy: part II. Gene therapy strategies and clinical applications,” Discov Med. 18(98):151-61; “The Clinibook: Clinical Gene Transfer State of the Art,” Odile Cohen-Haguenauer (Editor), EDP Sciences (Oct. 31, 2012), ISBN-10: 2842541715).

In some aspects, components of the present invention are delivered using nanoscale delivery systems, such as nanoparticles. Additionally, liposomes and other particulate delivery systems can be used. For example, vectors including the components of the present methods can be packaged in liposomes prior to delivery to the subject or to cells derived therefrom, such as described in U.S. Pat. Nos. 5,580,859; 5,549,127; 5,264,618; 5,703,055, all incorporated herein by reference in their entireties. Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or entrap and retain nucleic acid.

The methods described herein can also be used to generate non-human genetically modified organisms. Generally, in these methods expression cassettes comprising polynucleotide sequences of the various components, as well as a targeting vector, are introduced into zygote cells to site-specifically introduce a selected polynucleotide sequence at a DNA target sequence in the genome to generate a modification of the genomic DNA. The selected polynucleotide sequence is present in the targeting vector. Modifications of the genomic DNA typically include insertion of a polynucleotide sequence, deletion of a polynucleotide sequence, or mutation of a polynucleotide sequence, for example, gene correction, gene replacement, gene tagging, transgene insertion, gene disruption, gene mutation, mutation of gene regulatory sequences, and so on. In one embodiment of methods to generate non-human genetically modified organisms, the organism is a mouse. Generating transgenic mice involves five basic steps (Cho A., et al., “Generation of Transgenic Mice,” Current protocols in cell biology, (2009); CHAPTER.Unit-19.11): (1) purifying a transgenic construct (e.g., expression cassettes comprising the various components of the various methods described herein, as well as a targeting vector); (2) harvesting donor zygotes; (3) microinjecting the transgenic construct into the mouse zygote; (4) implanting the microinjected zygotes into pseudo-pregnant recipient mice; and (5) performing genotyping and analysis of the modification of the genomic DNA established in founder mice.

In another embodiment of methods to generate non-human genetically modified organisms, the organism is a plant. Thus, the components described herein are used to effect efficient, cost-effective gene editing and manipulation in plant cells. It is generally preferable to insert a functional recombinant DNA in a plant genome at a non-specific location. However, in certain instances, it may be useful to use site-specific integration to introduce a recombinant DNA construct into the genome. Recombinant vectors for use in plants are known in the art. The vectors can include, for example, scaffold attachment regions (SARs), origins of replication, and/or selectable markers.

Methods and compositions for transforming plants by introducing a recombinant DNA construct into a plant genome include any of a number of methods known in the art. One method for constructing transformed plants is microprojectile bombardment. Agrobacterium-mediated transformation is another method for constructing transformed plants. Alternatively, other non-Agrobacterium species (e.g., Rhizobium) and other prokaryotic cells that are able to infect plant cells and introduce heterologous nucleotide sequences into the infected plant cell's genome can be used. Other transformation methods include electroporation, liposomes, transformation using pollen or viruses, chemicals that increase free DNA uptake, or free DNA delivery by means of microprojectile bombardment. DNA constructs of the present invention may be introduced into the genome of a plant host using conventional transformation techniques that are well known to those skilled in the art (see, e.g., “Methods to Transfer Foreign Genes to Plants,” Y Narusaka, et al., cdn.intechopen.com/pdfs-wm/30876.pdf).

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. From the above description and the following Examples, one skilled in the art can ascertain essential characteristics of this invention, and without departing from the spirit and scope thereof, can make changes, substitutions, variations, and modifications of the invention to adapt it to various usages and conditions. Such changes, substitutions, variations, and modifications are also intended to fall within the scope of the present disclosure.

EXPERIMENTAL

Aspects of the present invention are further illustrated in the following Examples. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, concentrations, percent changes, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, temperature is in degrees Centigrade and pressure is at or near atmospheric. It should be understood that these Examples, while indicating some embodiments of the invention, are given by way of illustration only.

The following examples are not intended to limit the scope of what the inventors regard as various aspects of the present invention.

I. Use of Catalytically Inactive Cas9 Proteins as Site Specific Nuclease Blockers

The following Examples 1-4 illustrate the use of a catalytically inactive Cas9 (i.e. “dead” Cas9 or dCas9) to reduce off-target nuclease activity in eukaryotic cells. Additionally, these examples show how one can identify a specific spacer sequence (for incorporation into a sgRNA or crRNA) that is effective at blocking nuclease off-target activity in eukaryotic cells. Where the term sgRNA or single-guide RNA is used, it is understood by one skilled in the art that other guide polynucleotide systems, such as a crRNA/tracrRNA dual-guide system, present an alternative means of guiding dCas9 to the targeted site.

Example 1 Production of dCas9 Nuclease Blocker and Cas9 Nuclease Components

sgRNA components of dCas9 nuclease-blocker (dCas9-NB, i.e. a Cas9 lacking catalytic activity) ribonucleoprotein (RNP) complexes (also termed “sgRNA/dCas9 complex” herein) and catalytically active Cas9 nuclease RNP complexes (also termed “sgRNA/Cas9 complex” herein) were produced by in vitro transcription (e.g., T7 Quick High Yield RNA Synthesis Kit, New England Biolabs, Ipswich, Mass.) from double-stranded DNA templates incorporating a T7 promoter at the 5′ end of the DNA sequence. Polymerase Chain Reaction (PCR) using 5′ overlapping primers was used to assemble the double-stranded DNA templates for transcription of sgRNA components. The sgRNA components, templates and primers used are identified in Table 1. The sequences of the oligonucleotide primers used in the assembly are presented in Table 2.

TABLE 1
Overlapping Primers for Generation of dCas9-NB and Cas9 Nuclease sgRNA Component Templates

Component | Target for DNA binding | Primers
Cas9 sgRNA | VEGFA | A, B, C, D, E
dCas9 sgRNA | AAVS1 | A, B, C, D, F
dCas9 sgRNA | VEGFA off-target A2 | A, B, C, D, G
dCas9 sgRNA | VEGFA off-target A3 | A, B, C, D, H
dCas9 sgRNA | VEGFA off-target A4 | A, B, C, D, I
dCas9 sgRNA | VEGFA off-target B1 | A, B, C, D, J
dCas9 sgRNA | VEGFA off-target B2 | A, B, C, D, K
dCas9 sgRNA | VEGFA off-target C1 | A, B, C, D, L
dCas9 sgRNA | VEGFA off-target C3 | A, B, C, D, M
dCas9 sgRNA | VEGFA off-target D2 | A, B, C, D, N
dCas9 sgRNA | VEGFA off-target D3 | A, B, C, D, O
*DNA primer sequences are shown in Table 2

TABLE 2
DNA Primer Sequences Used

A | AAAAAAAGCACCGACTCGGTGCC | SEQ ID NO: 1
B | AGTAATAATACGACTCACTATAG | SEQ ID NO: 2
C | GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC | SEQ ID NO: 3
D | AAAAAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGC | SEQ ID NO: 4
E | TAATACGACTCACTATAGGGTGGGGGGAGTTTGCTCCGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 5
F | TAATACGACTCACTATAGGGGCCACTAGGGACAGGATGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 6
G | TAATACGACTCACTATAGTGGAGGGAGTTTGCTCCTGGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 7
H | TAATACGACTCACTATAGGACGGATTTGTGGGATGGAGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 8
I | TAATACGACTCACTATAGCAGGACATTCTGACACCCCGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 9
J | TAATACGACTCACTATAGGAGGCTCCCATCACGGGGGGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 10
K | TAATACGACTCACTATAGTGGGGATCACAGGTTCCCCGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 11
L | TAATACGACTCACTATAGAGAGCTCTTCTGACTACAGGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 12
M | TAATACGACTCACTATAGGACCAAATGAGACCAGTCCGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 13
N | TAATACGACTCACTATAGCCCATTATGATAGGGAGGGGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 14
O | TAATACGACTCACTATAGCTCCTGGGGATGGAAGGGCGTTTTAGAGCTAGAAATAGC | SEQ ID NO: 15
P | CACTCTTTCCCTACACGACGCTCTTCCGATCTCCAGATGGCACATTGTCAGA | SEQ ID NO: 16
Q | GGAGTTCAGACGTGTGCTCTTCCGATCTCCTAGTGACTGCCGTCTGC | SEQ ID NO: 17
R | GGAGTTCAGACGTGTGCTCTTCCGATCTacctggccATCATCCTTCTA | SEQ ID NO: 18
S | CACTCTTTCCCTACACGACGCTCTTCCGATCTCAGCAGACCCACTGAGTCAA | SEQ ID NO: 19
T | CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTGACTGGAGTTCAGACGTGTGCTC | SEQ ID NO: 20
U | AATGATACGGCGACCACCGAGATCTACACNNNNNNNNACACTCTTTCCCTACACGACG | SEQ ID NO: 21

The PCR reaction to assemble the sgRNA DNA template proceeded as follows: Three “internal” DNA primers (C, D, E-O, Table 2) were present at a concentration of 2 nM each. Two “outer” DNA primers (A, B, Table 2) corresponding to the T7 promoter and the 3′ end of the RNA sequence were present at 640 nM to drive the amplification reaction. PCR reactions were performed using Kapa HiFi Hotstart™ PCR kit (Kapa Biosystems, Inc., Wilmington, Mass.) as per manufacturer's recommendation. PCR assembly reactions were carried out using the following thermal cycling conditions: 98° C. for 2 minutes, 35 cycles of 15 seconds at 98° C., 15 seconds at 62° C., 15 seconds at 72° C., and a final extension at 72° C. for 2 minutes.

Between approximately 0.25 and 0.5 μg of the DNA template for the sgRNA components was transcribed using T7 High Yield RNA synthesis Kit (New England Biolabs, Ipswich, Mass.) for approximately 16 hours at 37° C. Transcription reactions were DNase I-treated (New England Biolabs, Ipswich, Mass.). The quality of the transcribed RNA was checked by capillary electrophoresis on a Fragment Analyzer (Advanced Analytical Technologies, Inc., Ames, Iowa). The Cas9 and dCas9-NB sgRNA component sequences were as follows:

TABLE 3
Cas9 and dCas9-NB sgRNA Component Sequences

DNA target | RNA Sequence (5′ to 3′)
VEGFA | GGGUGGGGGGAGUUUGCUCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 22)
AAVS1 | GGGGCCACUAGGGACAGGAUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 23)
VEGFA off-target A2 | GUGGAGGGAGUUUGCUCCUGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 24)
VEGFA off-target A3 | GGACGGAUUUGUGGGAUGGAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 25)
VEGFA off-target A4 | GCAGGACAUUCUGACACCCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUUU (SEQ ID NO: 26)
VEGFA off-target B1 | GGAGGCUCCCAUCACGGGGGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 27)
VEGFA off-target B2 | GUGGGGAUCACAGGUUCCCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGGUCGGUCUUUUUUU (SEQ ID NO: 28)
VEGFA off-target C1 | GAGAGCUCUUCUGACUACAGGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 29)
VEGFA off-target C3 | GGACCAAAUGAGACCAGUCCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGGUCGGUCUUUUUUU (SEQ ID NO: 30)
VEGFA off-target D2 | GCCCAUUAUGAUAGGGAGGGGUUUUAGAGCUAGAAAUAGCAAGUAUAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 31)
VEGFA off-target D3 | GCUCCUGGGGAUUGGAAGGGCGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUUUUU (SEQ ID NO: 32)
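By way of illustration only, the relationship between a T7-promoter-containing template and its expected transcript can be sketched in Python. The sketch assumes the commonly used T7 promoter sequence TAATACGACTCACTATAG, with transcription beginning at its final G; the function name is hypothetical, and the input shown is primer E (SEQ ID NO: 5), which covers only the 5′ portion of the full-length template.

```python
# Assumption: minimal T7 promoter, transcription starts at its last base (G).
T7_PROMOTER = "TAATACGACTCACTATAG"


def transcript_from_template(template_dna: str) -> str:
    """Return the RNA expected from a sense-strand template carrying a T7 promoter."""
    template_dna = template_dna.upper()
    start = template_dna.find(T7_PROMOTER)
    if start < 0:
        raise ValueError("no T7 promoter found in template")
    # The +1 position is the final G of the promoter sequence.
    first_transcribed = start + len(T7_PROMOTER) - 1
    return template_dna[first_transcribed:].replace("T", "U")


# Primer E (SEQ ID NO: 5): T7 promoter + VEGFA spacer + start of the sgRNA backbone.
primer_e = "TAATACGACTCACTATAGGGTGGGGGGAGTTTGCTCCGTTTTAGAGCTAGAAATAGC"
rna_5prime = transcript_from_template(primer_e)
print(rna_5prime)
```

The printed RNA can be compared by eye with the 5′ end of the VEGFA sgRNA sequence (SEQ ID NO: 22).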

Protein components of Cas9 and dCas9-NB RNPs were expressed from bacterial expression vectors in E. coli (BL21 (DE3)) and purified using affinity, ion exchange and size exclusion chromatography according to methods described in Jinek et al., 2012. The coding sequence for S. pyogenes Cas9 included the two nuclear localization sequences (NLS) at the C-terminus. The dCas9 variant of NLS-tagged Cas9, in which active site residues from both nuclease domains were mutated (Jinek, et al., 2012), was prepared by introducing mutations into the coding sequence of S. pyogenes Cas9 by site directed mutagenesis (Q5 Site-directed Mutagenesis Kit, New England Biolabs, Ipswich, Mass.). This method for production of Cas9 and/or dCas9-NB RNPs can be applied to the production of other Cas9 and/or dCas9-NB RNPs as described herein.

Example 2 Deep Sequencing Analysis for Detection of Target Modifications in Eukaryotic Cells

This example illustrates the use of a MiSeq Sequencer (Illumina, San Diego, Calif.) for deep sequencing analysis to evaluate and compare the DNA cleavage (as inferred from non-homologous end joining, or NHEJ) of selected Cas9 nuclease off-target sequences in the presence and absence of dCas9-NBs. In this example, Cas9 was directed by a specific sgRNA to a sequence (GGGTGGGGGGAGTTTGCTCCTGG, SEQ ID NO:82) within the human gene Vascular Endothelial Growth Factor A (VEGFA). dCas9 was directed towards an off-target sequence (GGATGGAGGGAGTTTGCTCCTGG, SEQ ID NO:83) known to be cleaved off-target by the Cas9 RNP nuclease, in order to prevent off-target cleavage, as well as towards a sequence (GGGGCCACTAGGGACAGGATTGG, SEQ ID NO:84) within the control locus, Adeno-Associated Virus Integration Site 1 (AAVS1).

A. Transfection of Cas9/dCas9-NB RNPs:

To assemble Cas9 and dCas9 RNPs, 1.3 μl of sgRNA (corresponding to approximately 1-9 μg or approximately 25-250 pmol) were incubated for 2 minutes at 95° C. then allowed to equilibrate to room temperature for about 5 minutes. Subsequently, Cas9 and dCas9 were mixed with a corresponding sgRNA to form RNPs in reaction buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 5 mM MgCl₂, 5% glycerol). 20 pmols Cas9 were combined with the target sgRNA, and 0 or 20 pmols of dCas9 were combined with off-target directed sgRNAs, and functional RNPs were assembled by incubating at 37° C. for 10 min. Finally, 20 pmols Cas9 RNP was combined with 0 (i.e. just the dCas9-NB sgRNA component) or 20 pmols dCas9 RNP immediately prior to transfection into cells. Experiments were performed in triplicate.

Cas9/dCas9-NB RNP complexes were transfected into K562 cells (ATCC, Manassas, Va.), using the Nucleofector® 96-well Shuttle System (Lonza, Allendale, N.J.) and the following protocol: RNP complexes were dispensed in a 5 μL final volume into individual wells of a 96-well plate. K562 cells suspended in media were transferred from a culture flask to a 50 mL conical tube. The cells were then pelleted by centrifugation for 3 minutes at 200×g, the culture medium was aspirated, and the cells were washed once with calcium and magnesium-free PBS. The K562 cells were then pelleted by centrifugation for 3 minutes at 200×g, the PBS was aspirated, and the cell pellet was resuspended in 10 mL of calcium and magnesium-free PBS.

K562 cells were counted using the Countess® II Automated Cell Counter (Life Technologies, Grand Island, N.Y.). 4.2×10⁷ cells were transferred to a 50 ml tube and pelleted. The PBS was aspirated and the cells were resuspended in Nucleofector™ SF (Lonza, Allendale, N.J.) solution to a density of 1×10⁷ cells/mL. 20 μL of the cell suspension were then added to individual wells containing 5 μL of RNP complexes and the entire volume was transferred to the wells of a 96-well Nucleocuvette™ Plate (Lonza, Allendale, N.J.). The plate was loaded onto the Nucleofector™ 96-well Shuttle™ (Lonza, Allendale, N.J.) and cells were nucleofected using the 96-FF-120 Nucleofector™ program (Lonza, Allendale, N.J.). Post-nucleofection, 80 μL Iscove's Modified Dulbecco's Media (IMDM, Life Technologies, Grand Island, N.Y.), supplemented with 10% FBS (Fisher Scientific, Pittsburgh, Pa.) and supplemented with penicillin and streptomycin (Life Technologies, Grand Island, N.Y.), was added to each well and 50 μL of the cell suspension was transferred to a 96-well cell culture plate containing 150 μL pre-warmed IMDM complete culture medium. The plate was then transferred to a tissue culture incubator and maintained at 37° C. in 5% CO₂ for approximately 48 hours.
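As a worked illustration of the arithmetic implied by the cell numbers above, the following Python sketch computes the resuspension volume, the number of cells delivered per well, and the number of 20 μL aliquots the suspension can supply. The variable and function names are hypothetical, and pipetting losses are ignored.

```python
# Values taken from the protocol text; names are illustrative only.
TOTAL_CELLS = 42_000_000          # 4.2 x 10^7 cells pelleted
DENSITY_PER_ML = 10_000_000       # target density, 1 x 10^7 cells/mL
SUSPENSION_UL_PER_WELL = 20       # cell suspension added to each well


def resuspension_volume_ul(total_cells: int, cells_per_ml: int) -> float:
    """Volume of Nucleofector solution (in microliters) giving the target density."""
    return total_cells * 1000 / cells_per_ml


def cells_per_well(cells_per_ml: int, volume_ul: float) -> float:
    """Number of cells contained in volume_ul of suspension."""
    return cells_per_ml * volume_ul / 1000


volume_ul = resuspension_volume_ul(TOTAL_CELLS, DENSITY_PER_ML)
per_well = cells_per_well(DENSITY_PER_ML, SUSPENSION_UL_PER_WELL)
max_wells = volume_ul / SUSPENSION_UL_PER_WELL
print(volume_ul, per_well, max_wells)
```

Under these assumptions, 4.2 mL of suspension supplies 2×10⁵ cells per well.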

Genomic DNA (gDNA) was isolated from K562 cells 48 hours after Cas9/dCas9-NB transfection using 50 μL QuickExtract DNA Extraction solution (Epicentre, Madison, Wis.) per well followed by incubation at 37° C. for 10 minutes. 50 μL water was added to the samples, and next they were incubated at 75° C. for 10 minutes and 95° C. for 5 minutes to stop the reaction. gDNA was stored at −20° C. until further processing.

B. Sequencing Library Preparation:

Using previously isolated gDNA, a first PCR was performed using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, Ipswich, Mass.) at 1× concentration, primers at 0.5 μM each, and 3.75 μL of gDNA in a final volume of 10 μL, and amplified at 98° C. for 1 minute, 35 cycles of 10 s at 98° C., 20 s at 60° C., 30 s at 72° C., and a final extension at 72° C. for 2 min. The PCR reaction was diluted 1:100 in water. Target-specific primers are shown in Table 4:

TABLE 4
Target-specific Primers Used for Sequencing

Target | Primers
VEGFA on-target | P, Q
VEGFA off-target 1 | R, S
*DNA primer sequences are shown in Table 2

A second ‘barcoding’ PCR was set up using unique primers for each sample, facilitating multiplex sequencing (oligonucleotides T and U in Table 2, where a unique 8 bp index sequence, denoted by “NNNNNNNN” (SEQ ID NO:33), allowed demultiplexing of each amplicon during sequence analysis).

The second PCR was performed using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, Ipswich, Mass.) at 1× concentration, primers at 0.5 μM each, and 1 μL of 1:100 diluted first PCR in a final volume of 10 μL, and amplified at 98° C. for 1 minute, 12 cycles of 10 s at 98° C., 20 s at 60° C., 30 s at 72° C., and a final extension at 72° C. for 2 min. PCR reactions were pooled into a single microfuge tube for SPRIselect bead (Beckman Coulter, Pasadena, Calif.) based clean up of amplicons for sequencing.

To the pooled amplicons, 0.9× volumes of SPRIselect beads were added, mixed and incubated at room temperature (RT) for 10 minutes. The microfuge tube was placed on a magnetic tube stand (Beckman Coulter, Pasadena, Calif.) until the solution had cleared. The supernatant was removed and discarded, and the residual beads were washed with 1 volume of 85% ethanol and incubated at RT for 30 s. After incubation, the ethanol was aspirated and the beads were air dried at RT for 10 min. The microfuge tube was then removed from the magnetic stand and 0.25× volumes of water (Qiagen, Venlo, Limburg) was added to the beads, mixed vigorously, and incubated for 2 min. at RT. The microfuge tube was spun in a microcentrifuge to collect the contents of the tube, and was then returned to the magnet and incubated until the solution had cleared, and the supernatant containing the purified amplicons was dispensed into a clean microfuge tube. The purified amplicon library was quantified using the Nanodrop™ 2000 system (Thermo Fisher Scientific, Wilmington, Del.).

The amplicon library was normalized to 4 nM concentration as calculated from optical absorbance at 260 nm (Nanodrop™, Thermo Fisher Scientific, Wilmington, Del.) and the size of the amplicons. The library was analyzed on a MiSeq Sequencer with MiSeq Reagent Kit v2, 300 Cycles (Illumina, San Diego, Calif.), using a paired-end run of two 151-cycle reads plus two eight-cycle index reads.
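The normalization to 4 nM can be illustrated with a short Python sketch. It assumes the widely used approximation of 660 g/mol per base pair of double-stranded DNA; the function names and the example input values (10 ng/μL, 300 bp amplicons, 100 μL final volume) are hypothetical and are not taken from the experiments described herein.

```python
# Assumption: average molar mass of one dsDNA base pair is about 660 g/mol.
BP_MASS_G_PER_MOL = 660.0


def dsdna_nanomolar(conc_ng_per_ul: float, amplicon_bp: int) -> float:
    """Convert a mass concentration (ng/uL) of dsDNA amplicons to nM."""
    return conc_ng_per_ul / (BP_MASS_G_PER_MOL * amplicon_bp) * 1e6


def dilution_volumes(stock_nm: float, target_nm: float, final_ul: float):
    """Return (library volume, diluent volume) in uL to reach target_nm."""
    if stock_nm < target_nm:
        raise ValueError("library is already below the target concentration")
    stock_ul = final_ul * target_nm / stock_nm
    return stock_ul, final_ul - stock_ul


# Hypothetical library: 10 ng/uL of 300 bp amplicons, diluted to 4 nM in 100 uL.
library_nm = dsdna_nanomolar(10.0, 300)
library_ul, diluent_ul = dilution_volumes(library_nm, 4.0, 100.0)
print(round(library_nm, 1), round(library_ul, 2), round(diluent_ul, 2))
```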

C. Deep Sequencing Data Analysis:

The identity of products in the sequencing data was analyzed based upon the index barcode sequences adapted onto the amplicons in the second round of PCR. A computational script was used to process the MiSeq data by executing the following tasks:

1. Reads were aligned to the human genome (build GRCh38/38) using Bowtie (bowtie-bio.sourceforge.net/index.shtml) software.
2. Aligned reads were compared to wild type loci; reads not aligning to any part of the loci were discarded.
3. Reads matching wild-type sequence were tallied. Reads with indels (surrounding 5 bp from the Cas9 cut site) were categorized by indel type and tallied.
4. Total indel reads were divided by the sum of wild-type reads and indel reads to give percent-mutated reads.
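The tallying and percentage steps listed above can be sketched as follows. This is an illustrative sketch only and is not the computational script used in the example: the data representation (each aligned read as a list of (position, indel type) events) and the function names are hypothetical, and reads whose indels all lie outside the 5 bp window are assumed to be left untallied.

```python
from collections import Counter

WINDOW_BP = 5  # indels are counted only within 5 bp of the cut site


def tally_reads(reads, cut_site):
    """Tally wild-type reads and indel reads near the cut site.

    reads: iterable of lists of (position, indel_type) events; an empty
    list represents a read matching the wild-type sequence.
    """
    counts = Counter()
    for events in reads:
        near = [kind for pos, kind in events if abs(pos - cut_site) <= WINDOW_BP]
        if not events:
            counts["wild_type"] += 1
        elif near:
            counts[near[0]] += 1
        # Assumption: reads with indels only outside the window are not tallied.
    return counts


def percent_mutated(counts):
    """Indel reads divided by (wild-type + indel reads), as a percentage."""
    wild_type = counts["wild_type"]
    indel = sum(n for kind, n in counts.items() if kind != "wild_type")
    total = wild_type + indel
    return 100.0 * indel / total if total else 0.0


# Hypothetical input: three wild-type reads, one deletion at the cut site,
# and one insertion far from the cut site (ignored).
example_reads = [[], [], [], [(100, "deletion")], [(130, "insertion")]]
example_counts = tally_reads(example_reads, cut_site=101)
print(percent_mutated(example_counts))
```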

FIG. 8 shows the effects of dCas9-NBs on VEGFA sgRNA/Cas9 on-target editing at the VEGFA locus. As can be seen from the data in the figure, the addition of a dCas9-NB targeted to the VEGFA on-target locus inhibits on-target editing, while dCas9-NBs targeted to distinct regions do not have a significant effect (n=3, error bars show standard deviation, *p<0.05, Student's t-test (two-tailed) comparing 3 vs. 4).

FIG. 9 shows the effects of dCas9-NBs on VEGFA sgRNA/Cas9 off-target editing at a known VEGFA off-target locus on human chromosome 15. As can be seen from the data in the figure, directing a dCas9-NB to the off-target locus, either by the VEGFA on-target sgRNA or by an sgRNA targeted specifically to the chromosome 15 off-target locus, impairs cleavage, while dCas9-NBs targeted to distinct regions do not have a significant effect (n=3, error bars show standard deviation, *p<0.05, Student's t-test (two-tailed) comparing 3 vs. 4, 5 vs. 6, 7 vs. 8).

Descriptions of the samples used in these experiments and shown in FIGS. 8 and 9 are given in Table 5:

TABLE 5
Sample Descriptions for FIGS. 8 and 9

Sample | Description
1 | AAVS1 sgRNA, 0 pmol dCas9
2 | AAVS1 sgRNA, 20 pmol dCas9
3 | VEGFA sgRNA, 0 pmol dCas9
4 | VEGFA sgRNA, 20 pmol dCas9
5 | VEGFA off-target sgRNA A2, 0 pmol dCas9
6 | VEGFA off-target sgRNA A2, 20 pmol dCas9
7 | VEGFA off-target sgRNA A3, 0 pmol dCas9
8 | VEGFA off-target sgRNA A3, 20 pmol dCas9
9 | VEGFA off-target sgRNA A4, 0 pmol dCas9
10 | VEGFA off-target sgRNA A4, 20 pmol dCas9
11 | VEGFA off-target sgRNA B1, 0 pmol dCas9
12 | VEGFA off-target sgRNA B1, 20 pmol dCas9
13 | VEGFA off-target sgRNA B2, 0 pmol dCas9
14 | VEGFA off-target sgRNA B2, 20 pmol dCas9
15 | VEGFA off-target sgRNA C1, 0 pmol dCas9
16 | VEGFA off-target sgRNA C1, 20 pmol dCas9
17 | VEGFA off-target sgRNA C3, 0 pmol dCas9
18 | VEGFA off-target sgRNA C3, 20 pmol dCas9
19 | VEGFA off-target sgRNA D2, 0 pmol dCas9
20 | VEGFA off-target sgRNA D2, 20 pmol dCas9
21 | VEGFA off-target sgRNA D3, 0 pmol dCas9
22 | VEGFA off-target sgRNA D3, 20 pmol dCas9

Following the guidance of the present specification and examples, the deep sequencing analysis described in this example can be practiced by one of ordinary skill in the art with other Cas9/dCas9 RNP complexes (i.e. assembled with distinct sgRNAs and/or distinct ratios of Cas9, dCas9, and sgRNA).

Example 3 Identification of Cas9 RNP Off-Target Loci

This example illustrates the method through which off-target Cas9 nuclease sites may be identified. The method presented here is adapted from Tsai et al., “GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases,” Nat Biotechnol., 2015 February; 33(2):187-97.

A. Identify a Target-Site of Interest:

A given locus in a genome of interest (e.g., a human genome) is screened using bioinformatics approaches known to those skilled in the art to identify Cas9 target-sites. A 20 base pair target-site, followed by an NGG protospacer adjacent motif (PAM), is selected for nuclease targeting.
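One simple way to screen a sequence for candidate target-sites of this form is sketched below in Python. This is a minimal illustration under stated assumptions, not the bioinformatics approach used by the inventors: it scans only the strand supplied (the reverse complement would have to be scanned separately), applies no specificity scoring, and the function name is hypothetical. The test input is the VEGFA target sequence given in Example 2 (SEQ ID NO:82).

```python
import re

# A 20-nt protospacer followed by an NGG PAM. The lookahead lets
# overlapping candidate sites be reported as well.
SITE_PATTERN = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")


def find_target_sites(sequence: str):
    """Return (start, protospacer, PAM) for every 20-nt site followed by NGG."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1), m.group(2))
            for m in SITE_PATTERN.finditer(sequence)]


# VEGFA target sequence from Example 2 (SEQ ID NO:82).
sites = find_target_sites("GGGTGGGGGGAGTTTGCTCCTGG")
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```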

B. Assemble GUIDE-Seq Components:

Oligos are obtained (Integrated DNA Technologies, Coralville, Iowa) for generating a blunt, double-stranded oligodeoxynucleotide (dsODN) that will be utilized for the GUIDE-Seq method. The dsODN contains phosphorothioate linkages at the 5′ ends of both DNA strands. The dsODN is assembled by incubating the two oligos in annealing buffer (e.g., 10 mM Tris, pH 7.5-8.0, 50 mM NaCl, 1 mM EDTA) for 3 min at 95° C. and allowing the oligos to cool to RT.

C. Transfection of GUIDE-Seq Components:

Cells from a species of interest (e.g., human cells) are procured from a commercial repository (e.g., ATCC, DSMZ). Cells are grown to an appropriate density for transfection. Cells are transfected with an sgRNA/Cas9 protein complex and the DNA donor oligo via methods known to those skilled in the art (e.g., nucleofection or lipid transfection of DNA plasmid encoding RNP components as well as dsODN).

D. Sequencing Library Preparation and Analysis:

gDNA is harvested 48 hours after cell transfection and purified using Agencourt DNAdvance (Beckman Coulter, Pasadena, Calif.). Purified gDNA is fragmented with methods known to those skilled in the art (e.g., mechanical shearing via sonication or enzymatic shearing with NEB Fragmentase (New England Biolabs, Ipswich, Mass.)) to an average length of 500 base pairs, then end-repaired, A-tailed and ligated to adapters. PCR with primers complementary to the dsODN tag and Illumina sequencing adapter sequences (Illumina, San Diego, Calif.), respectively, is used for target-enrichment. The target-enriched library is sequenced using a MiSeq Sequencer (Illumina, San Diego, Calif.). Reads are mapped back to the respective species' genome and read coverage is calculated using BedTools (bedtools.readthedocs.org/en/latest/). Integrative Genomics Viewer (IGV, broadinstitute.org/igv/) is used to map the starting (5′) and ending (3′) position of reads, and peaks are called using MACS2 (pypi.python.org/pypi/MACS2). The sequencing data is used to confirm that a putative genomic locus is a candidate off-target sequence. Following the guidance of the present examples, the identification of novel off-target loci can be practiced by one of ordinary skill in the art.

Example 4 dCas9 Off-Target Blocking with Truncated Single-Guide RNAs (tru-gRNAs)

This example illustrates methods where dCas9-NBs may be assembled with truncated guides. The method presented here is adapted from Fu Y. et al., “Improving CRISPR-Cas nuclease specificity using truncated guide RNAs,” Nat Biotechnol. 2014 March; 32(3):279-84. Truncated single-guide RNAs (tru-sgRNAs) of 17-18 nt have been shown to possess increased specificity relative to 20 nt sgRNAs. Thus, a dCas9-NB assembled with a tru-sgRNA may be targeted directly to a genomic motif and PAM of an off-target locus to reduce off-target editing while having minimal inhibition of on-target editing.

A. Design of tru-sgRNA to Enable dCas9 Mediated Off-Target Nuclease Blocking:

Using methods described in Example 3 herein, a given off-target genomic locus (i.e. spacer sequence) is identified. Next, a tru-sgRNA is designed to target said off-target location in the genome. The tru-sgRNA/dCas9 RNP may target a sequence contained entirely within the off-target motif, or it may target a sequence partially overlapping with the off-target motif.
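As a minimal illustration of the design step, the Python sketch below derives a 17 or 18 nt truncated spacer from a 20 nt protospacer by removing nucleotides from the 5′ (PAM-distal) end, which is the truncation scheme of the tru-gRNAs described by Fu et al. It covers only the case in which the truncated spacer lies entirely within the off-target motif. The function name is hypothetical, and the example input is the 20 nt protospacer of the VEGFA off-target sequence given in Example 2 (SEQ ID NO:83, without its TGG PAM).

```python
def truncate_spacer(protospacer: str, length: int = 17) -> str:
    """Return the PAM-proximal `length` nucleotides of a protospacer.

    Truncation removes bases from the 5' (PAM-distal) end, leaving the
    3' end adjacent to the PAM unchanged.
    """
    if length not in (17, 18):
        raise ValueError("tru-sgRNA spacers are 17 or 18 nt long")
    if len(protospacer) < length:
        raise ValueError("protospacer is shorter than the requested length")
    return protospacer.upper()[-length:]


# 20 nt protospacer of the VEGFA off-target site (SEQ ID NO:83 minus the PAM).
off_target_protospacer = "GGATGGAGGGAGTTTGCTCC"
tru_17 = truncate_spacer(off_target_protospacer, 17)
tru_18 = truncate_spacer(off_target_protospacer, 18)
print(tru_17, tru_18)
```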

B. Production of dCas9 Nuclease Blocker Components:

dCas9 is assembled with a short (e.g., 17 nt) tru-sgRNA and Cas9 is assembled with a sgRNA (e.g., 20 nt) to produce functional RNPs. RNA components are transcribed from DNA templates incorporating a T7 promoter at the 5′ end as described in the Experimental section herein. dCas9 (D10A, H840A) and Cas9 proteins are recombinantly expressed in E. coli. RNPs are assembled by incubating protein and RNA components together at 37° C. for 10 minutes.

C. Transfection of tru-sgRNA Containing dCas9-NB and sgRNA Containing Cas9 RNP:

Cells from a species of interest are procured from a commercial repository (e.g., ATCC, DSMZ). Cells are grown to a level of confluency that enables transfection. Tru-sgRNAs complexed with dCas9 are mixed with sgRNAs assembled with Cas9. Next, the mixture is transfected into a cell line of interest using methods known to those skilled in the art (e.g., nucleofection or lipid transfection) as described in Example 1 herein.

D. Sequencing Library Preparation:

gDNA is then harvested 48 hours later using Quick Extract (Epicentre, Madison, Wis.) per the manufacturer's instructions. Two rounds of PCR, as described in Example 1 herein, are used to amplify and barcode the genomic region targeted by the tru-sgRNA/dCas9-NB. Adapter oligos and dimers are removed by performing SPRIselect bead (Beckman Coulter, Pasadena, Calif.)-based clean up of the sequencing library. Sequencing library concentration is determined by the Nanodrop™ 2000 system (Thermo Scientific, Wilmington, Del.).

E. Deep Sequencing Analysis:

The library is analyzed on MiSeq Sequencer as follows:

1. Reads are aligned to the human genome (build GRCh38/38) using Bowtie (bowtie-bio.sourceforge.net/index.shtml) software.
2. Aligned reads are compared to wild type loci; reads not aligning to any part of the loci are discarded.
3. Reads matching wild-type sequence are tallied. Reads with indels are categorized by indel type and tallied.
4. Total indel reads are divided by the sum of wild-type reads and indel reads to give percent-mutated reads.

II. dCas9 Directed Positioning of Homology Directed Repair Donors

Example 5 Use of dCas9 to Position Homology Donor Nucleotides Near a Targeted Site for Increasing Homology Directed Repair (HDR) Efficiency

This system consists of a site-specific endonuclease (e.g., Cas9 complexed with a sgRNA that targets a DNA target sequence of interest), and one or more catalytically inactive dCas9 molecules complexed with a sgRNA that targets DNA sequences adjacent to the cut site (see FIGS. 7A and 7B). These dCas9 molecules are also tethered to a HDR donor molecule (e.g., dsDNA, ssDNA, RNA, a plasmid, or the like). The tethered dCas9 is used to position the donor molecule in an orientation that will increase the likelihood that the donor molecule will be incorporated into the target site through HDR, thereby introducing a desired change to the target sequence.

A. DNA and RNA Constructs:

Oligonucleotides are ordered from manufacturers (e.g., Integrated DNA Technologies, Coralville, Iowa; or Eurofins, Luxembourg). sgRNA transcription constructs are assembled by polymerase chain reaction (PCR).

The primers for sgRNA transcription constructs consist of a primer containing a 5′ T7 promoter sequence, a primer containing a unique spacer sequence, primers containing the sgRNA TRCR backbone, and a reverse primer that may contain a complementary sequence to the homology donor for tethering the donor to the 3′ end of the sgRNA.

T7 sgRNA transcription constructs are PCR-amplified. Two outer primers (forward oligo contains the T7 promoter; reverse oligos contain the 3′ end of the sgRNA backbone or the homology donor complementary sequence for tethering) are present in the PCR reaction at 640 nM. Unique spacer and sgRNA backbone oligos are present at 2 nM. PCR reactions are performed using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, Ipswich, MA) following manufacturer's recommendations. The T7 sgRNA transcription construct assembly PCR is carried out using the following thermal cycling conditions: 98° C. for 2 minutes, 29 cycles of 98° C. for 20 seconds, 62° C. for 20 seconds, and 72° C. for 15 seconds, followed by a final extension of 72° C. for 2 minutes. DNA constructs are evaluated by capillary electrophoresis (Fragment Analyzer, Advanced Analytical Technologies, Ames, Iowa).

RNA components are produced through in vitro transcription (T7 Quick High Yield RNA Synthesis Kit, New England Biolabs, Ipswich, MA) from a double-stranded DNA template. The RNA is then treated with DNase I (New England Biolabs, Ipswich, MA) to remove any double-stranded DNA and incubated at 37° C. for 10 minutes. 0.5 M EDTA is then added to the transcription reactions, which are incubated at 75° C. for 10 minutes to inactivate the DNase I.

Homology donors are ordered as single-stranded DNA oligos of approximately 90 nucleotides in length. The homology donors are complementary to the coding sequence and are designed to be centered on the cut site, with the PAM replaced with an EcoRI restriction enzyme site and homology arms of approximately 42 nucleotides in length matching the target sequence.
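The donor layout described above can be illustrated with a minimal Python sketch. It is not the disclosed design procedure: the function name `design_donor`, its parameters, and the toy target sequence are hypothetical, and the sketch centers the donor on the PAM rather than exactly on the cut site as a simplification. It swaps the 3-nt PAM for the 6-nt EcoRI recognition site (GAATTC), keeps 42-nt homology arms on either side, and returns the reverse complement of the input strand.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence (palindromic)
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_donor(target: str, pam_start: int,
                 arm_len: int = 42, pam_len: int = 3) -> str:
    """Build a single-stranded donor: arm + EcoRI site + arm.

    `target` is the coding-strand sequence (hypothetical input) and
    `pam_start` is the 0-based index of the first PAM nucleotide.
    The PAM is replaced by the EcoRI site, and the result is returned
    as the reverse complement of the coding strand.
    """
    left_start = pam_start - arm_len
    right_end = pam_start + pam_len + arm_len
    if left_start < 0 or right_end > len(target):
        raise ValueError("target too short for the requested homology arms")
    left_arm = target[left_start:pam_start]
    right_arm = target[pam_start + pam_len:right_end]
    return reverse_complement(left_arm + ECORI_SITE + right_arm)


# Toy example: 48 nt upstream, a TGG PAM, 50 nt downstream (made-up sequence).
toy_target = "ACGT" * 12 + "TGG" + "TTGCA" * 10
donor = design_donor(toy_target, pam_start=48)
print(len(donor), donor)
```

With the default 42-nt arms the donor is 42 + 6 + 42 = 90 nucleotides long, which matches the approximate length given in the text.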

B. sgRNA/Cas9 and sgRNA/dCas9 Complex Generation:

S. pyogenes catalytically active Cas9 and catalytically inactive dCas9 are C-terminally tagged with two nuclear localization sequences (NLS) and recombinantly expressed in E. coli. All sgRNA and tethered sgRNA are incubated for 2 minutes at 95° C., removed from the thermal cycler and allowed to equilibrate to room temperature. Cas9 ribonucleoprotein (RNP) complexes (also termed “sgRNA/Cas9 complex” and “sgRNA/dCas9 complex” herein) are set up in triplicate with 2 μM Cas9 or 2 μM dCas9, 6 μM sgRNA or 6 μM tethered sgRNA and 2 μM donor oligo in binding buffer (20 mM HEPES, 100 mM KCl, 5 mM MgCl₂, 1 mM DTT, and 5% glycerol at pH 7.4) in a volume of 6 μl. The RNPs are then allowed to bind at 37° C. for 10 minutes. After annealing, the Cas9 RNP and dCas9 RNP-donor tethers can be combined to a final volume of 12 μl.
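As a worked check of the amounts implied by these concentrations, the short Python sketch below converts each concentration into picomoles per 6 μl reaction (1 μM × 1 μl = 1 pmol). The helper name `pmol` and the dictionary layout are illustrative assumptions only.

```python
def pmol(conc_uM: float, vol_uL: float) -> float:
    """Amount in pmol, using 1 uM x 1 uL = 1 pmol."""
    return conc_uM * vol_uL


# Concentrations and volume taken from the binding reaction described above.
components_uM = {"Cas9 or dCas9": 2.0, "sgRNA": 6.0, "donor oligo": 2.0}
reaction_uL = 6.0

amounts = {name: pmol(conc, reaction_uL) for name, conc in components_uM.items()}
sgrna_to_protein = amounts["sgRNA"] / amounts["Cas9 or dCas9"]

for name, amount in amounts.items():
    print(f"{name}: {amount:g} pmol")
print(f"sgRNA:protein molar ratio = {sgrna_to_protein:g}")
```

Each 6 μl reaction therefore contains 12 pmol of protein, 36 pmol of sgRNA and 12 pmol of donor, a 3:1 molar excess of guide over protein.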

C. Experimental Conditions:

Experimental conditions for the use of various embodiments of the invention are described below and illustrated in FIG. 10.

1) No transfection control: cells are not electroporated.

2) Cas9+standard sgRNA: cells are transfected with sgRNA and catalytically active Cas9 targeting the locus of interest.

3) Cas9+standard sgRNA for target locus adjacent site: cells are transfected with sgRNA and catalytically active Cas9 targeting the locus adjacent site to determine if the spacer and PAM at that site provide good cleavage activity at that site.

4) Donor only control: cells are transfected with the donor polynucleotide to determine if there is incorporation through HDR at the site.

5) Tethered sgRNA/dCas9 and donor transfection: cells are transfected with donor polynucleotide and with tethered sgRNA/dCas9 complexes for the site adjacent to the target locus.

6) Tethered sgRNA/Cas9 at the target locus: cells are transfected with catalytically active tethered sgRNA/Cas9 complex with donor polynucleotide to the target locus.

7) Standard sgRNA/Cas9 and donor transfection: cells are transfected with catalytically active sgRNA/Cas9 complex and donor polynucleotide. This will determine the HDR incorporation rates for a standard HDR experiment.

8) Standard sgRNA/Cas9 and tethered sgRNA/dCas9 (one site) and donor transfection: cells are transfected with catalytically active sgRNA/Cas9 complex targeting the target locus and with tethered sgRNA/dCas9 targeting the target adjacent locus and donor polynucleotide.

9) Standard sgRNA/Cas9 and tethered sgRNA/dCas9 (two sites) and donor transfection: cells are transfected with catalytically active sgRNA/Cas9 complex targeting the target locus and with tethered sgRNA/dCas9 targeting upstream and downstream of the target adjacent locus and donor polynucleotide. The two tethers on the two sgRNA/dCas9 complexes stretch the donor polynucleotide across the double-stranded break and make that region available for HDR.

D. Cell Culture and Transfections:

K562 cells (ATCC, Manassas, Va.) are cultured in suspension in IMDM medium supplemented with 10% FBS and 1% penicillin and streptomycin at 37° C. with 100% humidity. K562 cells are transfected using the Nucleofector® 96-well Shuttle System (Lonza, Allendale, N.J.). RNPs and dCas9 RNPs are arranged in a 96-well plate with 2 μl of Cas9 RNP or 4 μl of Cas9 and dCas9 RNP combined. K562 cells are transferred to a 50 ml conical centrifuge tube and centrifuged at 200×g for 3 minutes. The media is aspirated and the cell pellet washed in calcium and magnesium-free PBS. The cells are centrifuged once more and then resuspended in Nucleofector SF buffer (Lonza, Allendale, N.J.) at a concentration of 1×10⁷ cells/ml. 20 μl of this cell suspension is added to the RNP in the 96-well plate, mixed, and then the entire volume is transferred to a 96-well Nucleocuvette™ Plate (Lonza, Allendale, N.J.). The plate is then loaded into the Nucleofector™ 96-well Shuttle™ (Lonza, Allendale, N.J.) and cells are nucleofected using the 96-FF-120 Nucleofector™ program (Lonza, Allendale, N.J.). Immediately following nucleofection, 80 μl of complete IMDM medium is added to each well of the 96-well Nucleocuvette™ Plate. The entire contents of the well are then transferred to a 96-well tissue culture plate containing 100 μl of complete IMDM medium. The cells are cultured at 37° C. with 100% humidity conditions for 48 hours.
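The cell number and culture volume per well follow directly from the figures above. The following Python sketch works through the arithmetic; the function names are illustrative assumptions, not part of the protocol.

```python
def cells_per_well(density_per_mL: float, volume_uL: float) -> float:
    """Number of cells contained in `volume_uL` of a suspension."""
    return density_per_mL * volume_uL / 1000.0


def final_culture_volume_uL(rnp_uL: float, cells_uL: float = 20.0,
                            recovery_uL: float = 80.0,
                            plate_uL: float = 100.0) -> float:
    """Total volume after nucleofection, recovery medium and transfer."""
    return rnp_uL + cells_uL + recovery_uL + plate_uL


n_cells = cells_per_well(1e7, 20.0)               # 20 ul at 1x10^7 cells/ml
vol_single = final_culture_volume_uL(rnp_uL=2.0)  # Cas9 RNP only
vol_paired = final_culture_volume_uL(rnp_uL=4.0)  # Cas9 + dCas9 RNP
print(n_cells, vol_single, vol_paired)
```

Each well thus receives 2×10⁵ cells, cultured in roughly 200 μl of medium.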

After 48 hours the K562 cells are centrifuged at 500×g for 5 minutes and the medium is removed. The cells are washed once in calcium and magnesium-free PBS. The cell pellets are then resuspended in 50 μl of QuickExtract DNA Extraction solution (Epicentre, Madison, Wis.). The gDNA samples obtained are then incubated at 37° C. for 10 minutes, 65° C. for 6 minutes and 95° C. for 3 minutes to stop the reaction. gDNA samples are then diluted with 50 μl of water and stored at −20° C.

This gDNA is PCR-amplified using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, Ipswich, Mass.) at 1× concentration, primers at 0.5 μM each, and 3.75 μL of gDNA in a final volume of 10 μL, with amplification at 98° C. for 1 minute, 35 cycles of 10 s at 98° C., 20 s at 60° C., 30 s at 72° C., and a final extension at 72° C. for 2 min. The PCR reaction is then diluted 1:100 in water.

A second “barcoding” PCR is set up using unique primers for each sample, facilitating multiplex sequencing. The second PCR is performed using Q5 Hot Start High-Fidelity 2× Master Mix (New England Biolabs, Ipswich, Mass.) at 1× concentration, primers at 0.5 μM each, and 1 μL of 1:100 diluted first PCR in a final volume of 10 μL, with amplification at 98° C. for 1 minute, 12 cycles of 10 s at 98° C., 20 s at 60° C., 30 s at 72° C., and a final extension at 72° C. for 2 minutes.

E. SPRIselect Clean-Up:

PCR reactions are pooled into a single microfuge tube for SPRIselect™ bead (Beckman Coulter, Pasadena, Calif.) based clean-up of amplicons for sequencing.

To pooled amplicons, 0.9× volumes of SPRIselect™ beads are added, mixed, and incubated at room temperature (RT) for 10 minutes. The microfuge tube is placed on a magnetic tube stand (Beckman Coulter, Pasadena, Calif.) until the solution has cleared. Supernatant is removed and discarded, and the residual beads are washed with 1 volume of 85% ethanol and incubated at RT for 30 s. After incubation, ethanol is aspirated and beads are air dried at RT for 10 min. The microfuge tube is then removed from the magnetic stand and 0.25× volumes of Qiagen EB buffer (Qiagen, Venlo, Netherlands) are added to the beads, mixed vigorously, and incubated for 2 minutes at RT. The microfuge tube is returned to the magnet and incubated until the solution has cleared, and the supernatant containing the purified amplicons is dispensed into a clean microfuge tube. The purified amplicon library is quantified using the Nanodrop™ 2000 system (Thermo Fisher Scientific, Wilmington Del.) and library quality is analyzed using the Fragment Analyzer™ system (Advanced Analytical Technologies, Inc., Ames, Iowa) and the DNF-910 dsDNA Reagent Kit™ (Advanced Analytical Technologies, Inc., Ames, Iowa).

F. Deep Sequencing Set-Up:

The amplicon library is normalized to a 4 nM concentration as calculated from Nanodrop values and the size of the amplicons. The library is analyzed on a MiSeq Sequencer with MiSeq Reagent Kit v2™, 300 Cycles (Illumina, San Diego, Calif.), with two 151-cycle paired-end reads plus two eight-cycle index reads.
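The conversion from a Nanodrop mass concentration to molarity is not spelled out in the text. A minimal Python sketch of one common way to do it is given below, assuming an average mass of 660 g/mol per base pair of double-stranded DNA; that constant, the function names, and the 20 μl final volume are assumptions for illustration.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (assumed average)


def library_nM(conc_ng_per_uL: float, mean_len_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM)."""
    return conc_ng_per_uL * 1e6 / (AVG_BP_MASS * mean_len_bp)


def dilution_to_target(stock_nM: float, target_nM: float = 4.0,
                       final_uL: float = 20.0) -> tuple:
    """Return (library volume, diluent volume) in ul to reach `target_nM`."""
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    library_uL = target_nM * final_uL / stock_nM
    return library_uL, final_uL - library_uL


# Hypothetical reading: 6.6 ng/ul for a library of 250-bp amplicons.
stock = library_nM(6.6, 250)
lib_uL, diluent_uL = dilution_to_target(stock)
print(f"{stock:.1f} nM stock: mix {lib_uL:.2f} ul library + {diluent_uL:.2f} ul diluent")
```

The amplicon length would come from the Fragment Analyzer trace and the mass concentration from the Nanodrop reading.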

G. Deep Sequencing Data Analysis:

The identity of products in the sequencing data is analyzed based upon the index barcode sequences adapted onto the amplicons in the second round of PCR. A computational script is used to process the MiSeq data by executing the following tasks:

1. Reads are aligned to the human genome (build GRCh38/38) using Bowtie (bowtie-bio.sourceforge.net/index.shtml) software.

2. Aligned reads are compared to the expected wild-type target locus sequence. Reads not aligning to any part of the target locus are discarded.

3. Reads matching the wild-type target sequence are tallied. Reads with indels are categorized by indel type and tallied.

4. Total indel reads are divided by the sum of wild-type reads and indel reads to give percent-mutated reads.

These data are then analyzed to determine whether sgRNA/dCas9 tethered donor polynucleotides increase HDR efficiency compared to passively diffused donor polynucleotides.
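Steps 3 and 4 of the analysis can be sketched in Python as follows. This is not the script used in the example. It assumes that each read retained after step 2 comes with a SAM-style CIGAR string from a gapped aligner, and it treats any read without an insertion or deletion as wild type, ignoring mismatches. The function names and the toy CIGAR strings are hypothetical.

```python
import re
from collections import Counter

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def classify_read(cigar: str) -> str:
    """Label a read by its indel content, e.g. 'del2', 'ins1' or 'wild-type'."""
    labels = []
    for length, op in _CIGAR_OP.findall(cigar):
        if op == "I":
            labels.append(f"ins{length}")
        elif op == "D":
            labels.append(f"del{length}")
    return "+".join(labels) if labels else "wild-type"


def percent_mutated(cigars):
    """Tally reads by class and return (tally, percent-mutated reads)."""
    tally = Counter(classify_read(c) for c in cigars)
    wild_type = tally.get("wild-type", 0)
    indel = sum(n for label, n in tally.items() if label != "wild-type")
    total = wild_type + indel
    pct = 100.0 * indel / total if total else 0.0
    return tally, pct


# Toy input: six wild-type reads, three 2-nt deletions, one 1-nt insertion.
toy_cigars = ["150M"] * 6 + ["70M2D78M"] * 3 + ["70M1I79M"]
tally, pct = percent_mutated(toy_cigars)
print(dict(tally), pct)
```

The percentage is indel reads divided by the sum of wild-type and indel reads, as in step 4.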

III. Use of Cas9 Nickase Mutants to Increase the Efficiency of Homology Directed Repair as a Fraction of Total Repair Events

Example 6 Use of Tandem Cas9 Nickases to Direct Homology-Directed Repair at Cleavage Sites in Eukaryotic Cells

This example illustrates the use of a Cas9 nickase mutant where one nuclease domain is inactivated (Cas9D10A) to engage preferentially homology-directed repair (HDR) pathways and block mutagenic repair pathways at break sites in eukaryotic cells. In this example Cas9D10A is used with two specific single-guide RNAs (sgRNAs) that deliver the nickase to two sites on the same strand 30-60 nucleotides apart. Spacer sequences were chosen from available sequences in human genomic DNA so that each of the two sgRNAs would target Cas9 to a location on either side of the desired region for modification.

Production of Cas9D10A Nickase and Cas9 Nuclease Components:

sgRNA components of Cas9D10A ribonucleoprotein (RNP) complexes (also termed “sgRNA/Cas9 nickase complexes” herein) and catalytically active Cas9 nuclease RNP complexes (also termed “sgRNA/Cas9 complexes” herein) were produced by in vitro transcription (e.g., T7 Quick High Yield RNA Synthesis Kit, New England Biolabs, Ipswich, Mass.) from double-stranded DNA templates incorporating a T7 promoter at the 5′ end of the DNA sequence. The double-stranded DNA templates for the sgRNA components were assembled by polymerase chain reaction (PCR) using 5 overlapping primers. The oligonucleotides used in the assembly are presented in Table 6.

TABLE 6
Overlapping Primers for Generating Cas9D10A and Cas9 Nuclease sgRNA Component Templates

Component                  Target for DNA binding   Primers
Cas9 and Cas9D10A sgRNA    CD34 Target 1            A, B, C, D, E
Cas9 and Cas9D10A sgRNA    CD34 Target 2            A, B, C, D, F
Cas9 and Cas9D10A sgRNA    CD34 Target 3            A, B, C, D, G
Cas9 and Cas9D10A sgRNA    CD34 Target 4            A, B, C, D, H
Cas9 and Cas9D10A sgRNA    CD34 Target 5            A, B, C, D, I

Primer   Sequence
A        GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAAC (SEQ ID NO: 3)
B        AAAAAAAGCACCGACTCGGTGCCACTTTTTCAAGTTGATAACGGACTAGC (SEQ ID NO: 4)
C        AAAAAAAGCACCGACTCGGTGCC (SEQ ID NO: 1)
D        AGTAATAATACGACTCACTATAG (SEQ ID NO: 2)
E        TAATACGACTCACTATAGGAACACTGTGCTGATTACAGGTTTTAGAGCTAGAAATAGC (SEQ ID NO: 34)
F        TAATACGACTCACTATAGGTTTGTGTTTCCATAAACTGGTTTTAGAGCTAGAAATAGC (SEQ ID NO: 35)
G        TAATACGACTCACTATAGGCTACTAACTTGAGCTCCCCGTTTTAGAGCTAGAAATAGC (SEQ ID NO: 36)
H        TAATACGACTCACTATAGTCCCAAAGGCGGAGGGCGTTGTTTTAGAGCTAGAAATAGC (SEQ ID NO: 37)
I        TAATACGACTCACTATAGAGGCTGGGTTGCCGCCGTCGGTTTTAGAGCTAGAAATAGC (SEQ ID NO: 38)

The DNA primers were present at a concentration of 2 nM each. Two outer DNA primers corresponding to the T7 promoter (forward primer: Oligonucleotide A, Table 1), and the 3′ end of the RNA sequence (reverse primer: Oligonucleotide C, Table 1) were used at 640 nM to drive the amplification reaction. PCR reactions were performed using the Kapa HiFi Hotstart PCR™ kit (Kapa Biosystems, Inc., Wilmington, Mass.) as per the manufacturer's recommendation. PCR assembly reactions were carried out using the following thermal cycling conditions: 98° C. for 2 minutes, 35 cycles of 15 seconds at 98° C., 15 seconds at 62° C., 15 seconds at 72° C., and a final extension at 72° C. for 2 min. Following the PCR reaction, the quantity of PCR product was determined using capillary electrophoresis on a Fragment Analyzer (Advanced Analytical Technologies, Inc., Ames, Iowa).

Between 0.25 and 0.5 μg of the DNA template for the sgRNA components was transcribed using the T7 High Yield RNA Synthesis Kit (New England Biolabs, Ipswich, Mass.) for approximately 16 hours at 37° C. Transcription reactions were DNase I treated (New England Biolabs, Ipswich, Mass.). The quality of the transcribed RNA was checked by capillary electrophoresis on a Fragment Analyzer (Advanced Analytical Technologies, Inc., Ames, Iowa). Protein components of RNPs were expressed from bacterial expression vectors in E. coli (BL21 (DE3)) and purified using affinity, ion exchange and size exclusion chromatography according to methods described in Jinek, M., et al., “A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity,” Science 337 (2012) 816-821. This method for production of Cas9 and/or Cas9D10A/Cas9H840A RNPs can be applied to the production of other Cas9 and/or Cas9D10A/Cas9H840A RNPs as described herein. The coding sequence for S. pyogenes Cas9 included the two nuclear localization sequences (NLS) at the C-terminus. Cas9D10A or Cas9H840A nickase variants of NLS-tagged Cas9, where an active site residue of either nuclease domain is mutated (Jinek, et al., 2012), were prepared by introducing mutations into the coding sequence of S. pyogenes Cas9 by site-directed mutagenesis (e.g., Q5 Site-Directed Mutagenesis Kit, New England Biolabs, Ipswich, Mass.).

Example 7 Deep Sequencing Analysis for Detection of Target Modifications in Eukaryotic Cells

This example illustrates the use of a MiSeq Sequencer (Illumina, San Diego, Calif.) for deep sequencing analysis to quantify total editing events initiated by DNA cleavage by Cas9 or Cas9D10A and compare DNA repair types. Example DNA repair types can include mutagenic end-joining pathways such as non-homologous end joining (NHEJ) or insertion of material from a donor sequence by homology directed repair (HDR). In this example, Cas9 and Cas9D10A were directed to the human gene CD34 at five independent sites by specific sgRNAs.

A. Transfection of Cas9/Cas9D10A RNPs:

To assemble Cas9 and Cas9D10A RNPs, 1.36 μl of sgRNA (corresponding to approximately 1-5 μg) was incubated for 2 minutes at 95° C. then allowed to equilibrate to room temperature for approximately 5 minutes. Subsequently, Cas9 and Cas9D10A were mixed with a corresponding sgRNA to form RNPs in reaction buffer (20 mM HEPES, pH 7.5, 100 mM KCl, 5 mM MgCl₂, 5% glycerol). 20 pmol of Cas9 or Cas9D10A was combined with the target sgRNA and functional RNPs were assembled by incubating at 37° C. for 10 minutes. Finally, 20 pmol of Cas9 or Cas9D10A RNP was combined with 100 pmol of DNA donor oligonucleotide template for HDR prior to transfection into cells. Experiments were performed in triplicate.

TABLE 7
DNA Oligonucleotide Donor Templates

CD34 Target 1
TCAGTTTATGGAAACACAAACTCTTCTGTCCAGTCACAGAgaattcCTGTAATCAGCACAGTGTTCACCACCCCAGCCAATCGTTCAACT (SEQ ID NO: 39)

CD34 Target 2
CCAGAAACGACAGTCAAATTCACATCTACCTCTGTGATAAgaattcCAGTTTATGGAAACACAAACTCTTCTGTCCAGTCACAGACCTCT (SEQ ID NO: 40)

CD34 Target 3
ACCCAGCCTCCCTCCTAACGCCCTCCGCCTTTGGGACCAAgaattcGGGGAGCTCAAGTTAGTAGCAGCCAAGGAGAGGCGCTGCCTTGC (SEQ ID NO: 41)

CD34 Target 4
CCACCTTTTTTGGCCTCGACGGCGGCAACCCAGCCTCCCTgaattcAACGCCCTCCGCCTTTGGGACCAACCAGGGGAGCTCAAGTTAGT (SEQ ID NO: 42)

CD34 Target 5
CGAGGCATCTGGAGCCCGAACAAACCTCCACCTTTTTTGGgaattcCGACGGCGGCAACCCAGCCTCCCTCCTAACGCCCTCCGCCTTTG (SEQ ID NO: 43)

CD34 Targets 1 + 2 Cas9D10A
CACATCTACCTCTGTGATAAgCTCAGTTTATGGAAttCACAAAACTCTTCTGTCCGTCACAGAgCTCTGTAATCAGCACAGTGTTCACCA (SEQ ID NO: 44)

CD34 Targets 3 + 4 Cas9D10A
CCTCGACGGCGGCAACCCAGCCTCCCTgCTAACGCCCTCCGaaTTcTGGGACCAAgCAGGGGAGCTCAAGTTAGTAGCAGCCAAGGAGAG (SEQ ID NO: 45)

CD34 Targets 4 + 5 Cas9D10A
CCGAACAAACCTCCACCTTTTTTGGCgTCGACGGCGGCAACCgAattCCTCCCTCgTAACGCCCTCCGCCTTTGGGACCAACCAGGGGAG (SEQ ID NO: 46)

CD34 Targets 3 + 5 Cas9D10A
CCACCTTTTTTGGgCTCGACGGCGGCAACCCAGCCTCCCTCCgAAttCGCCCTCCGCCTTTGGGACCAAgCAGGGGAGCTCAAGTTAGTA (SEQ ID NO: 47)

Cas9/Cas9D10A RNP complexes were transfected into K562 cells (ATCC, Manassas, Va.), using the Nucleofector® 96-well Shuttle System (Lonza, Allendale, N.J.) and the following protocol: RNP and RNP plus donor complexes were dispensed in a 2-3 μL final volume into individual wells of a 96-well plate. K562 cells suspended in media were transferred from a culture flask to a 50 mL conical tube. Cells were then pelleted by centrifugation for 3 minutes at 200×g, the culture medium was aspirated, and the cells were washed once with calcium and magnesium-free PBS. K562 cells were then pelleted by centrifugation for 3 minutes at 200×g, the PBS was aspirated, and the cell pellet was resuspended in 10 mL of calcium and magnesium-free PBS.

K562 cells were counted using the Countess® II Automated Cell Counter™ (Life Technologies, Grand Island, N.Y.). 2.2×10⁷ cells were transferred to a 50 ml tube and pelleted. The PBS was aspirated and the cells were resuspended in Nucleofector™ SF (Lonza, Allendale, N.J.) solution to a density of 1×10⁷ cells/mL. 20 μL of the cell suspension was then added to individual wells containing 2-3 μL of RNP and RNP plus donor complexes and the entire volume was transferred to the wells of a 96-well Nucleocuvette™ Plate (Lonza, Allendale, N.J.). The plate was loaded onto the Nucleofector™ 96-well Shuttle™ (Lonza, Allendale, N.J.) and cells were nucleofected using the 96-FF-120 Nucleofector™ program (Lonza, Allendale, N.J.). Post-nucleofection, 80 μL of Iscove's Modified Dulbecco's Media (IMDM, Life Technologies, Grand Island, N.Y.), supplemented with 10% FBS (Fisher Scientific, Pittsburgh, Pa.) and with penicillin and streptomycin (Life Technologies, Grand Island, N.Y.), was added to each well and 50 μL of the cell suspension was transferred to a 96-well cell culture plate containing 150 μL of pre-warmed IMDM complete culture medium. The plate was then transferred to a tissue culture incubator and maintained at 37° C. in 5% CO₂ for approximately 48 hours.

Genomic DNA (gDNA) was isolated from K562 cells 48 hours after Cas9/Cas9D10A transfection using 50 μL of QuickExtract DNA Extraction solution (Epicentre, Madison, Wis.) per well followed by incubation at 37° C. for 10 minutes. 50 μL of water was added to the samples, which were then incubated at 75° C. for 10 minutes and 95° C. for 5 minutes to stop the reaction. gDNA was stored at −80° C. until further processing.

B. Sequencing Library Preparation:

Using previously isolated gDNA, a first PCR was performed using Q5 Hot Start High-Fidelity 2× Master Mix™ (New England Biolabs, Ipswich, Mass.) at 1× concentration, primers at 0.5 μM each, and 3.75 μL of gDNA in a final volume of 10 μL, with amplification at 98° C. for 1 minute, 35 cycles of 10 s at 98° C., 20 s at 60° C., 30 s at 72° C., and a final extension at 72° C. for 2 minutes. The PCR reaction was diluted 1:100 in water. Target-specific primers are shown in Table 8.

TABLE 8
Target-specific Primers Used

CD34 Target 1_F                    GGAGTTCAGACGTGTGCTCTTCCGATCTTGCAAGGCTAGTGCTAGTGG (SEQ ID NO: 48)
CD34 Target 1_R                    CACTCTTTCCCTACACGACGCTCTTCCGATCTACATGCACACCCATGTTTTG (SEQ ID NO: 49)
CD34 Target 2_F                    GGAGTTCAGACGTGTGCTCTTCCGATCTAACATTTCCAGGTGACAGGC (SEQ ID NO: 50)
CD34 Target 2_R                    CACTCTTTCCCTACACGACGCTCTTCCGATCTACATGCACACCCATGTTTTG (SEQ ID NO: 51)
CD34 Target 3_F                    GGAGTTCAGACGTGTGCTCTTCCGATCTGTGGGGGATTCTTGCTTTTT (SEQ ID NO: 52)
CD34 Target 3_R                    CACTCTTTCCCTACACGACGCTCTTCCGATCTCTCCAGAAAGCTGAACGAGG (SEQ ID NO: 53)
CD34 Target 4_F                    GGAGTTCAGACGTGTGCTCTTCCGATCTTTTCCTCTCTTCTCCCCTCC (SEQ ID NO: 54)
CD34 Target 4_R                    CACTCTTTCCCTACACGACGCTCTTCCGATCTCTGCCACAAAGGGGTTAAAA (SEQ ID NO: 55)
CD34 Target 5_F                    GGAGTTCAGACGTGTGCTCTTCCGATCTTTTCCTCTCTTCTCCCCTCC (SEQ ID NO: 56)
CD34 Target 5_R                    CACTCTTTCCCTACACGACGCTCTTCCGATCTCTGCCACAAAGGGGTTAAAA (SEQ ID NO: 57)
CD34 Targets 1 + 2 Cas9D10A_F      GGAGTTCAGACGTGTGCTCTTCCGATCTTGCAAGGCTAGTGCTAGTGG (SEQ ID NO: 58)
CD34 Targets 1 + 2 Cas9D10A_R      CACTCTTTCCCTACACGACGCTCTTCCGATCTCACATGCACACCCATGTTTT (SEQ ID NO: 59)
CD34 Targets 3-5 Cas9D10A_F        GGAGTTCAGACGTGTGCTCTTCCGATCTTCTCTTCTCCCCTCCCTTTT (SEQ ID NO: 60)
CD34 Targets 3-5 Cas9D10A_R        CACTCTTTCCCTACACGACGCTCTTCCGATCTGCCACAAAGGGGTTAAAAGTT (SEQ ID NO: 61)

A second ‘barcoding’ PCR was set up using unique primers for each sample, facilitating multiplex sequencing. Primer pairs are shown in Table 9.

TABLE 9
Barcoding Primers

ILMN_AMP_FORi5_BC9    AATGATACGGCGACCACCGAGATCTACACTGAACCTTACACTCTTTCCCTACACGACG (SEQ ID NO: 62)
ILMN_AMP_FORi5_BC10   AATGATACGGCGACCACCGAGATCTACACTGCTAAGTACACTCTTTCCCTACACGACG (SEQ ID NO: 63)
ILMN_AMP_FORi5_BC11   AATGATACGGCGACCACCGAGATCTACACTAAGTTCCACACTCTTTCCCTACACGACG (SEQ ID NO: 64)
ILMN_AMP_FORi5_BC12   AATGATACGGCGACCACCGAGATCTACACATAGAGGCACACTCTTTCCCTACACGACG (SEQ ID NO: 65)
ILMN_AMP_FORi5_BC13   AATGATACGGCGACCACCGAGATCTACACGGCTCTGAACACTCTTTCCCTACACGACG (SEQ ID NO: 66)
ILMN_AMP_FORi5_BC14   AATGATACGGCGACCACCGAGATCTACACAGGCGAAGACACTCTTTCCCTACACGACG (SEQ ID NO: 67)
ILMN_AMP_FORi5_BC15   AATGATACGGCGACCACCGAGATCTACACTAATCTTAACACTCTTTCCCTACACGACG (SEQ ID NO: 68)
ILMN_AMP_FORi5_BC16   AATGATACGGCGACCACCGAGATCTACACCAGGACGTACACTCTTTCCCTACACGACG (SEQ ID NO: 69)
ILMN_AMP_REVi7_BC49   CAAGCAGAAGACGGCATACGAGATATACATCGGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 70)
ILMN_AMP_REVi7_BC50   CAAGCAGAAGACGGCATACGAGATATGCCTAAGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 71)
ILMN_AMP_REVi7_BC51   CAAGCAGAAGACGGCATACGAGATATTCAAGTGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 72)
ILMN_AMP_REVi7_BC52   CAAGCAGAAGACGGCATACGAGATATCTGATCGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 73)
ILMN_AMP_REVi7_BC53   CAAGCAGAAGACGGCATACGAGATATGTAGCCGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 74)
ILMN_AMP_REVi7_BC54   CAAGCAGAAGACGGCATACGAGATATTTGACTGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 75)
ILMN_AMP_REVi7_BC55   CAAGCAGAAGACGGCATACGAGATATGGAACTGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 76)
ILMN_AMP_REVi7_BC56   CAAGCAGAAGACGGCATACGAGATATTGACATGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 77)
ILMN_AMP_REVi7_BC57   CAAGCAGAAGACGGCATACGAGATATGGACGGGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 78)
ILMN_AMP_REVi7_BC58   CAAGCAGAAGACGGCATACGAGATATCCACTCGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 79)
ILMN_AMP_REVi7_BC59   CAAGCAGAAGACGGCATACGAGATATCTTTTGGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 80)
ILMN_AMP_REVi7_BC60   CAAGCAGAAGACGGCATACGAGATATTGAGTGGTGACTGGAGTTCAGACGTGTGCTC (SEQ ID NO: 81)

The second PCR was performed using Q5 Hot Start High-Fidelity 2× Master Mix™ (New England Biolabs, Ipswich, Mass.) at 1× concentration, primers at 0.5 μM each, and 1 μL of 1:100 diluted first PCR in a final volume of 10 μL, with amplification at 98° C. for 1 minute, 12 cycles of 10 s at 98° C., 20 s at 60° C., 30 s at 72° C., and a final extension at 72° C. for 2 minutes. PCR reactions were pooled into a single microfuge tube for SPRIselect™ bead (Beckman Coulter, Pasadena, Calif.) based clean-up of amplicons for sequencing.

To pooled amplicons, 0.9× volumes of SPRIselect™ beads were added, mixed, and incubated at room temperature (RT) for 10 minutes. The microfuge tube was placed on a magnetic tube stand (Beckman Coulter, Pasadena, Calif.) until the solution had cleared. Supernatant was removed and discarded, and the residual beads were washed with 1 volume of 85% ethanol and incubated at RT for 30 s. After incubation, ethanol was aspirated and beads were air dried at RT for 10 min. The microfuge tube was then removed from the magnetic stand and 0.25× volumes of water (Qiagen, Venlo, Limburg) were added to the beads, mixed vigorously, and incubated for 2 minutes at RT. The microfuge tube was spun in a microcentrifuge to collect the contents of the tube, and was then returned to the magnet and incubated until the solution had cleared. The supernatant containing the purified amplicons was dispensed into a clean microfuge tube. The purified amplicon library was quantified using the Nanodrop™ 2000 system (Thermo Scientific, Wilmington Del.).

The amplicon library was normalized to a 4 nM concentration as calculated from Nanodrop values and the size of the amplicons. The library was analyzed on a MiSeq Sequencer with MiSeq Reagent Kit v2™, 300 Cycles (Illumina, San Diego), with two 151-cycle paired-end reads plus two eight-cycle index reads.

C. Deep Sequencing Data Analysis:

The identity of products in the sequencing data was analyzed based upon the index barcode sequences adapted onto the amplicons in the second round of PCR. A computational script was used to process the MiSeq data by executing the following tasks:

1. Reads were aligned to the human genome (build GRCh38/38) using Bowtie (bowtie-bio.sourceforge.net/index.shtml) software.

2. Aligned reads were compared to wild-type loci. Reads not aligning to any part of the loci were discarded.

3. Reads matching wild-type sequence were tallied. Reads with indels (surrounding the Cas9 cut site) were categorized by indel type and tallied.

4. Total indel reads were divided by the sum of wild-type reads and indel reads to give percent-mutated reads.

Indel structures were compared between sequence data generated from cells transfected with wild-type Cas9 RNP or Cas9 RNP+Donor, for each of the individual targets, and with Cas9D10A RNP or Cas9D10A RNP+Donor, for each of the pairs of targets. The experimental data demonstrated that cells transfected with Cas9 RNP exhibited a number of classes of mutant edits. Cas9 RNP+Donor showed a similar spectrum of mutant edits together with donor-dependent edits. Cells transfected with Cas9D10A RNP only showed no evidence of editing, whereas Cas9D10A RNP+Donor demonstrated levels of donor insertion similar to those of Cas9 RNP+Donor, but with no measurable mutant edits that could not be attributed to incorporation of the donor sequence.

FIG. 12 shows a comparison of repair types using either Cas9 or Cas9D10A at Targets 3 and 4 (human CD34 locus). Cas9 RNP complexed with sgRNA was used to target either CD34 Target 3 or CD34 Target 4. Cas9D10A RNPs complexed with sgRNA were used to target CD34 Target 3 and Target 4. Negative controls were Cas9 or Cas9D10A only, without sgRNA. The distribution of repair is shown by the bars. As can be seen, Cas9 RNP displayed only mutagenic repair. Cas9 RNP+Donor demonstrated mutagenic repair and HDR, whereas Cas9D10A RNP showed barely detectable mutagenic repair. Cas9D10A RNP+Donor demonstrated HDR edits with barely detectable mutagenic repair.

Table 10 contains an average of three replicates (excluding negative controls n=2) and standard deviation (STD) of each class.

TABLE 10
Data Used in FIG. 12

Sample                 Nuclease    % Unedited   % Mutagenic Repair   % HDR   Unedited STD   MUT STD   HDR STD
Target 3               Cas9        7.7          92.3                 0       0.57           0.57      0
Target 4               Cas9        58.3         41.7                 0       2.39           2.3       0
Target 3 + Donor       Cas9        2            59.7                 38.3    0              2.31      2.31
Target 4 + Donor       Cas9        22.3         54.3                 23.3    2.08           2.51      1.52
Target 3 neg           Cas9        100          0                    0       ND             ND        ND
Target 4 neg           Cas9        100          0                    0       ND             ND        ND
Target 3 + 4           Cas9D10A    99.3         0.7                  0       0.05           0.05      0
Target 3 + 4 Donor     Cas9D10A    81.8         0.5                  17.6    1.52           0         1.52
Target 3 + 4 neg       Cas9D10A    99.4         0.6                  0       ND             ND        ND
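Because this section concerns HDR as a fraction of total repair events, it can help to re-express the Table 10 means in that form. The Python sketch below does so for the donor-containing samples; the function name `hdr_fraction` and the dictionary layout are illustrative assumptions, and the calculation uses only the mean percentages reported in Table 10.

```python
# (% mutagenic repair, % HDR) means taken from Table 10.
table10 = {
    "Target 3 + Donor, Cas9": (59.7, 38.3),
    "Target 4 + Donor, Cas9": (54.3, 23.3),
    "Target 3 + 4 Donor, Cas9D10A": (0.5, 17.6),
}


def hdr_fraction(mutagenic_pct: float, hdr_pct: float) -> float:
    """HDR events as a fraction of all repair events (mutagenic + HDR)."""
    edited = mutagenic_pct + hdr_pct
    return hdr_pct / edited if edited else float("nan")


fractions = {name: hdr_fraction(*values) for name, values in table10.items()}
for name, frac in fractions.items():
    print(f"{name}: {100 * frac:.1f}% of repair events are HDR")
```

On these means, HDR accounts for roughly 39% and 30% of repair events with Cas9 at Targets 3 and 4, and roughly 97% with the paired Cas9D10A nickases. The absolute HDR percentage is lower for Cas9D10A (17.6%) than for Cas9 (38.3% and 23.3%).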

Following the guidance of the present specification and examples, the deep sequencing analysis described in this example can be practiced by one of ordinary skill in the art with other Cas9/Cas9D10A RNP complexes (i.e., assembled with distinct sgRNAs and distinct ratios of Cas9/Cas9D10A and donor oligonucleotide templates).

Example 8 Use of Paired Cas9D10A or Paired Cas9H840A Tandem Nickases to Enhance the Proportion of HDR-Specific Edits at a Break Site

This example illustrates the use of a Cas9 nickase mutant where one nuclease domain will be inactivated (either Cas9D10A or Cas9H840A) to engage preferentially HDR pathways and block mutagenic repair pathways at break sites in eukaryotic cells. In this example, spacer sequences for the two sgRNA sequences are chosen to vary the length of the deletion around the desired target site. Sequences are chosen such that the paired nickases are targeted to two sites on the same strand, varying the distance between the two sites in a range from 20 to 2000 nucleotides. Donor polynucleotides are designed with different lengths and positions relative to the locations of the spacer sequences and tested in combination with each pair of Cas9 nickase sgRNPs. Using the methods described in Examples 6 and 7, experiments are conducted to measure the frequency and type of DNA repair that takes place with each combination of paired nickases. Data are analyzed to identify the combination of nickase sgRNPs and donor polynucleotide that leads to the highest frequency of HDR with the lowest frequency of mutant editing.
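The selection of nick-site pairs described here can be sketched in Python as a simple filter over candidate sites. This is an illustration only: the `NickSite` record, the function `same_strand_pairs`, and the example coordinates are hypothetical, and identifying candidate sites (spacer plus PAM) is assumed to have been done beforehand.

```python
from itertools import combinations
from typing import NamedTuple


class NickSite(NamedTuple):
    name: str
    position: int  # reference coordinate of the nick
    strand: str    # strand that is nicked, "+" or "-"


def same_strand_pairs(sites, min_gap: int = 20, max_gap: int = 2000):
    """Return (site_a, site_b, gap) for pairs nicking the same strand
    with a spacing between min_gap and max_gap nucleotides."""
    ordered = sorted(sites, key=lambda s: s.position)
    pairs = []
    for a, b in combinations(ordered, 2):
        gap = b.position - a.position
        if a.strand == b.strand and min_gap <= gap <= max_gap:
            pairs.append((a.name, b.name, gap))
    return pairs


# Made-up candidate sites for illustration.
candidates = [
    NickSite("s1", 100, "+"),
    NickSite("s2", 145, "+"),
    NickSite("s3", 160, "-"),
    NickSite("s4", 2500, "+"),
]
pairs = same_strand_pairs(candidates)
print(pairs)
```

Each retained pair would then be tested with donor polynucleotides of different lengths and positions, as the example describes.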

Example 9 Use of Paired Cas9D10A or Paired Cas9H840A Tandem Nickases to Enhance the Proportion of HDR-Specific Edits at a Break Site and Introduce Different, Specific Nucleotide Insertions or Deletions

This example illustrates the use of a Cas9 nickase mutant where one nuclease domain will be inactivated (either Cas9D10A or Cas9H840A) to engage preferentially HDR pathways and block mutagenic repair pathways at break sites in eukaryotic cells. In this example, either paired Cas9D10A or paired Cas9H840A are used with two specific sgRNAs that deliver the paired nickases to two sites on the same strand 20-2000 nucleotides apart. Donor oligonucleotides are designed to deliver specific nucleotide insertions or deletions at the desired site (FIG. 11). Experiments are carried out varying spacing between nickases and varying donor sequence and length as described in Example 8 to identify the combination of reagents leading to the highest frequency of HDR and lowest frequency of mutagenic repair to introduce the intended modification at the desired site.

Example 10 Use of Paired Cas9D10A or Paired Cas9H840A Tandem Nickases to Enhance the Proportion of HDR-Specific Edits at a Break Site in Human Primary Cells with Various Donor Configurations

This example illustrates the use of a Cas9 nickase mutant where one nuclease domain will be inactivated (either Cas9D10A or Cas9H840A) to engage exclusively HDR pathways and block mutagenic repair pathways at break sites in eukaryotic cells. In this example, either paired Cas9D10A or paired Cas9H840A can be used in tandem complexed with two specific sgRNAs that deliver the paired nickases to two sites on the same strand 20-2000 nucleotides apart. The donor oligonucleotides are provided in different orientations and/or lengths to deliver specific nucleotide insertions or deletions between two target Cas9-nickase sites in human primary cells for therapeutic advantage.

Example 11 Use of Paired Cas9D10A and Cas9H840A Tandem Nickases to Enhance the Proportion of HDR-Specific Edits at a Break Site in Human Primary Cells with Various Donor Configurations

This example illustrates the use of pairs of Cas9 nickase mutants to engage preferentially homology-directed repair pathways and block mutagenic repair pathways at break sites in eukaryotic cells. In this example, Cas9D10A and Cas9H840A are used in combination with two specific sgRNAs that deliver the paired nickases to two sites resulting in nicking of the same strand 20-2000 nucleotides apart. The sgRNAs paired with Cas9D10A must be chosen to target protospacer sequences and PAM sequences on one strand. The sgRNAs paired with Cas9H840A must be chosen to target protospacer sequences and PAM sequences on the opposite strand to the Cas9D10A sgRNAs to ensure that the same strand is nicked twice. sgRNPs are assembled separately for each nickase mutant by combining the protein with the selected sgRNA. Donor oligonucleotides are designed to deliver specific nucleotide insertions or deletions at the desired site (FIG. 11) and synthesized by an oligonucleotide manufacturer (e.g., Integrated DNA Technologies, Coralville, Iowa). Cas9D10A-sgRNPs are mixed with Cas9H840A-sgRNPs before transfection and the pair of nickases targeting the same strand is transfected together with the donor oligonucleotide using methods described in the above examples. Experiments are carried out varying spacing between nickases and varying donor sequence and length as described in Example 8 to identify the combination of reagents leading to the highest frequency of HDR and lowest frequency of mutagenic repair to introduce the intended modification at the desired site.
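The strand-selection rule in this example can be made explicit with a small Python sketch. It relies on the general understanding, not stated in this example, that Cas9D10A (RuvC domain inactive) nicks the strand complementary to the spacer, while Cas9H840A (HNH domain inactive) nicks the strand that carries the protospacer and PAM. The function names and the "+"/"-" strand labels are illustrative assumptions.

```python
OPPOSITE = {"+": "-", "-": "+"}


def nicked_strand(nickase: str, protospacer_strand: str) -> str:
    """Return the strand nicked, given the strand carrying the protospacer and PAM."""
    if protospacer_strand not in OPPOSITE:
        raise ValueError("strand must be '+' or '-'")
    if nickase == "Cas9D10A":
        # RuvC inactive: HNH cuts the strand complementary to the spacer.
        return OPPOSITE[protospacer_strand]
    if nickase == "Cas9H840A":
        # HNH inactive: RuvC cuts the strand carrying the protospacer and PAM.
        return protospacer_strand
    raise ValueError(f"unknown nickase: {nickase}")


def nicks_same_strand(d10a_protospacer_strand: str,
                      h840a_protospacer_strand: str) -> bool:
    """True if a Cas9D10A/Cas9H840A pair nicks the same strand."""
    return (nicked_strand("Cas9D10A", d10a_protospacer_strand)
            == nicked_strand("Cas9H840A", h840a_protospacer_strand))


print(nicks_same_strand("+", "-"))  # protospacers on opposite strands
print(nicks_same_strand("+", "+"))  # protospacers on the same strand
```

Under this rule, placing the two protospacers on opposite strands makes both nicks fall on the same strand, which is the configuration the example calls for.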

Although preferred embodiments of the subject methods have been described in some detail, it is understood that obvious variations can be made without departing from the spirit and the scope of the methods as defined by the appended claims.

1-68. (canceled)
69. A method for positioning a donor polynucleotide near a cleavage site, comprising: contacting a first complex with a first target nucleic acid comprising the cleavage site, wherein the first complex comprises a catalytically active Cas9 protein and a first guide polynucleotide that comprises a spacer that binds to the first target nucleic acid, and the first complex binds and cleaves the first target nucleic acid at the cleavage site; and contacting a second complex with a second target nucleic acid, wherein the second complex comprises a catalytically inactive Cas9 protein (dCas9 protein), and a second guide polynucleotide that comprises a spacer that binds to the second target nucleic acid, wherein the second target nucleic acid is in proximity to the cleavage site, the second complex is associated with a first end of a donor polynucleotide, and the second complex binds but does not cleave the second target nucleic acid; wherein binding of the second complex positions the donor polynucleotide near the cleavage site, and at least a portion of the donor polynucleotide is inserted into the first target nucleic acid.
70. The method of claim 69, wherein the first guide polynucleotide is a single-guide RNA (sgRNA).
71. The method of claim 69, wherein the second guide polynucleotide is a dual-guide RNA.
72. The method of claim 69, further comprising: contacting a third complex with a third target nucleic acid, wherein the third complex comprises a dCas9 protein, and a third guide polynucleotide that comprises a spacer that binds to the third target nucleic acid, wherein the second target is located upstream of the cleavage site, the third target nucleic acid is located downstream of the cleavage site, the third complex is associated with a second end of the donor polynucleotide, and the third complex binds but does not cleave the third target nucleic acid.
73. The method of claim 72, wherein the third guide polynucleotide is a sgRNA.
74. The method of claim 69, wherein the catalytically active Cas9 protein is selected from the group consisting of a Streptococcus pyogenes Cas9 protein, a Streptococcus thermophilus Cas9 protein, a Staphylococcus aureus Cas9 protein, a Neisseria meningitidis Cas9 protein, and an orthologous Cas9 protein.
75. The method of claim 74, wherein the catalytically active Cas9 protein is the Streptococcus pyogenes Cas9 protein.
76. The method of claim 69, wherein the dCas9 protein is selected from the group consisting of a Streptococcus pyogenes dCas9 protein, a Streptococcus thermophilus dCas9 protein, a Staphylococcus aureus dCas9 protein, a Neisseria meningitidis dCas9 protein, and an orthologous dCas9 protein.
77. The method of claim 76, wherein the dCas9 protein is the Streptococcus pyogenes dCas9 protein.
78. The method of claim 72, wherein the dCas9 protein is selected from the group consisting of a Streptococcus pyogenes dCas9 protein, a Streptococcus thermophilus dCas9 protein, a Staphylococcus aureus dCas9 protein, a Neisseria meningitidis dCas9 protein, and an orthologous dCas9 protein.
79. The method of claim 78, wherein the dCas9 protein is the Streptococcus pyogenes dCas9 protein.
80. The method of claim 69, wherein the first target nucleic acid, the second target nucleic acid, and the third target nucleic acid comprise a double-stranded DNA.
81. The method of claim 69, wherein the method is performed in vitro.
82. The method of claim 69, wherein the donor polynucleotide is a single-stranded DNA.
83. The method of claim 69, wherein the donor polynucleotide is a double-stranded DNA.
84. A method for positioning a donor polynucleotide near a cleavage site in genomic DNA of a cell, comprising: introducing into the cell a first complex comprising a catalytically active Cas9 protein and a first guide polynucleotide that comprises a spacer that binds to a first target nucleic acid in the genomic DNA, the first target nucleic acid comprising the cleavage site, and a second complex comprising a catalytically inactive Cas9 protein (dCas9 protein) and a second guide polynucleotide that comprises a spacer that binds to a second target nucleic acid in proximity to the cleavage site, and the second complex is associated with a first end of a donor polynucleotide; wherein the first complex contacting the first target nucleic acid facilitates binding and cleaving the first target nucleic acid at the cleavage site; and wherein the second complex contacting the second target nucleic acid facilitates binding to the second target nucleic acid, binding of the second complex positions the donor polynucleotide near the cleavage site, and at least a portion of the donor polynucleotide is inserted into the first target nucleic acid.
85. The method of claim 84, wherein the genomic DNA is double-stranded DNA.
86. The method of claim 84, wherein the first complex and the second complex are introduced into the cell by a method selected from the group consisting of transfection, transduction, electroporation, liposome delivery, lipid nanoparticles, and viral delivery.
87. The method of claim 84, wherein the cell is a eukaryotic cell.
88. The method of claim 84, wherein the cell is selected from the group consisting of a bacterial cell, a yeast cell, a mammalian cell, and a plant cell.